AtDork Advanced Features Guide
This document covers three powerful features that make AtDork a professional OSINT tool: Post‑Processing Hooks, Cache System, and Database & History.
1. Post‑Processing Hooks
Introduction
Post‑processing hooks allow you to run external commands on every URL discovered by AtDork. This is useful for automating vulnerability scanning, header grabbing, screenshotting, or any other action you want to perform on the results without manual effort.
Functions
| Flag | Description | Default |
|---|---|---|
| `--exec "command {}"` | Execute a command for every result URL. `{}` is replaced with the URL. | |
| `--exec-on-vuln "command {}"` | Execute only on URLs flagged as vulnerable (requires `--filter-vuln`). | |
| `--exec-parallel N` | Number of parallel processes. | 1 |
| `--exec-timeout N` | Timeout per command in seconds. | 30 |
Usage Examples
# Check HTTP headers of every discovered URL
atdork -q "inurl:admin" -r 10 --exec "curl -sI {} | grep Server"
# Run WPScan only on WordPress-related results
atdork -q "inurl:wp-content" -r 30 --filter-vuln wordpress \
--exec-on-vuln "wpscan --url {} --enumerate p" --exec-parallel 3 --exec-timeout 60
# Save all URLs to a file for later use
atdork -q "site:example.com" -r 50 --exec "echo {} >> urls.txt"
How It Works
- After all results are collected and filtered, AtDork extracts the URL (`href`) from each result.
- If `--exec-on-vuln` is used, only URLs that pass the vulnerability filter are selected.
- The command template is filled by replacing `{}` with the (shell‑escaped) URL.
- Commands are executed using Python’s `subprocess.run()` in a thread pool (if `--exec-parallel` > 1).
- Stdout, stderr, return code, and any error are captured and logged.
- A short summary (success/failed/timeout) is printed after all commands finish.
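
The hook pipeline described above can be approximated with a short sketch. This is not AtDork's actual code: the names `run_hook` and `run_hooks` and the shape of the result dictionaries are hypothetical, and `shell=True` is an assumption based on the usage examples relying on pipes and redirection. It assumes a POSIX shell, since `shlex.quote` targets POSIX quoting rules.

```python
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_hook(template: str, url: str, timeout: float = 30.0) -> dict:
    """Fill the template with a shell-escaped URL and run it once."""
    # Replace every {} placeholder with the quoted URL so that characters
    # such as & or ; in the URL are not interpreted by the shell.
    command = template.replace("{}", shlex.quote(url))
    try:
        proc = subprocess.run(
            command, shell=True, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        return {"url": url, "status": "timeout", "returncode": None,
                "stdout": "", "stderr": ""}
    status = "success" if proc.returncode == 0 else "failed"
    return {"url": url, "status": status, "returncode": proc.returncode,
            "stdout": proc.stdout, "stderr": proc.stderr}


def run_hooks(template: str, urls: list, parallel: int = 1, timeout: float = 30.0):
    """Run the hook for every URL in a thread pool and summarize the outcomes."""
    with ThreadPoolExecutor(max_workers=max(1, parallel)) as pool:
        results = list(pool.map(lambda u: run_hook(template, u, timeout), urls))
    summary = {"success": 0, "failed": 0, "timeout": 0}
    for result in results:
        summary[result["status"]] += 1
    return results, summary


if __name__ == "__main__":
    _, summary = run_hooks("echo {}", ["https://example.com/?a=1&b=2"], parallel=2)
    print(summary)
```

Threads are sufficient here because each worker spends its time waiting on an external process rather than running Python code.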
2. Cache System
Introduction
The cache system stores search results locally in a SQLite database. This prevents redundant requests to search engines, reduces bandwidth, and allows offline access to previously fetched results.
Functions
| Flag | Description | Default |
|---|---|---|
| `--cache` | Enable caching. Every search result is saved and served from cache when possible. | |
| `--cache-only` | Only use cached results; never contact search engines. | |
| `--cache-ttl N` | Time‑to‑live in hours. Cached entries older than this are ignored. | 24 |
| `--clear-cache` | Delete all cached entries before starting the session. | |
| `--cache-db PATH` | Specify a custom cache database file. | atdork_cache.db |
Usage Examples
# Cache all search results for 48 hours
atdork -q "site:gov filetype:pdf" -r 20 --cache --cache-ttl 48
# Use only cached data (offline mode)
atdork -q "site:gov filetype:pdf" -r 20 --cache-only
# Clear old cache and start fresh
atdork --clear-cache
# Use a custom cache location
atdork -q "test" --cache --cache-db /path/to/my_cache.db
How It Works
- A `SearchCache` object is created when any cache flag is used.
- A SQLite table `api_cache` is created (if not exists) with columns for query, engine, parameters (JSON), results (JSON), timestamps, and hit count.
- On cache write: after a successful search, the query, engine, normalized parameters, and the result list are stored with an expiration timestamp.
- On cache read: before a search, the cache is checked for an exact match (query + engine + params) that hasn’t expired. If found, the cached results are returned immediately.
- `--cache-only` skips the network call entirely and returns empty if no cache hit.
- `--clear-cache` deletes all rows from the table.
- Expired entries are automatically cleaned up on initialization.
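
A minimal sketch of the read/write flow in the list above, using Python's standard `sqlite3` module. The table name `api_cache` comes from this guide; the class name `SearchCacheSketch`, the exact column names, the primary key, and the `cached_search` helper with its `fetch` callback are assumptions made for illustration, not AtDork's real implementation.

```python
import json
import sqlite3
import time


class SearchCacheSketch:
    """Hypothetical TTL cache backed by a single SQLite table."""

    def __init__(self, path: str = ":memory:", ttl_hours: float = 24.0):
        self.ttl_seconds = ttl_hours * 3600
        self.conn = sqlite3.connect(path)
        self.conn.execute(
            """CREATE TABLE IF NOT EXISTS api_cache (
                   query TEXT NOT NULL,
                   engine TEXT NOT NULL,
                   params TEXT NOT NULL,      -- normalized JSON string
                   results TEXT NOT NULL,     -- JSON-encoded result list
                   created_at REAL NOT NULL,
                   expires_at REAL NOT NULL,
                   hit_count INTEGER NOT NULL DEFAULT 0,
                   PRIMARY KEY (query, engine, params)
               )"""
        )
        # Remove expired rows once, when the cache is opened.
        self.conn.execute("DELETE FROM api_cache WHERE expires_at <= ?", (time.time(),))
        self.conn.commit()

    def set(self, query, engine, params_json, results, now=None):
        now = time.time() if now is None else now
        self.conn.execute(
            "INSERT OR REPLACE INTO api_cache VALUES (?, ?, ?, ?, ?, ?, 0)",
            (query, engine, params_json, json.dumps(results), now, now + self.ttl_seconds),
        )
        self.conn.commit()

    def get(self, query, engine, params_json, now=None):
        now = time.time() if now is None else now
        key = (query, engine, params_json)
        row = self.conn.execute(
            "SELECT results FROM api_cache "
            "WHERE query = ? AND engine = ? AND params = ? AND expires_at > ?",
            key + (now,),
        ).fetchone()
        if row is None:
            return None  # miss, or the entry has expired
        self.conn.execute(
            "UPDATE api_cache SET hit_count = hit_count + 1 "
            "WHERE query = ? AND engine = ? AND params = ?",
            key,
        )
        self.conn.commit()
        return json.loads(row[0])

    def clear(self):
        self.conn.execute("DELETE FROM api_cache")
        self.conn.commit()


def cached_search(cache, query, engine, params_json, fetch, cache_only=False):
    """Serve from cache when possible; in cache-only mode never call fetch."""
    hit = cache.get(query, engine, params_json)
    if hit is not None:
        return hit
    if cache_only:
        return []
    results = fetch(query)
    cache.set(query, engine, params_json, results)
    return results
```

The optional `now` argument exists only so that expiry can be exercised without waiting for real time to pass.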
The cache key is built from:
- Query string
- Backend engine
- Parameters: region, safesearch, timelimit, max_results (all normalized to a sorted JSON string)
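
One way such a key could be built is sketched below. The function name `build_cache_key` is hypothetical, and encoding missing parameters as `null` is an assumption; the guide only states that the four parameters are normalized to a sorted JSON string.

```python
import json

# The four parameters this guide lists as part of the cache key.
KEY_PARAMS = ("region", "safesearch", "timelimit", "max_results")


def build_cache_key(query: str, engine: str, params: dict) -> tuple:
    """Return (query, engine, params_json) with params in a canonical form."""
    # Keep only the known parameters; anything else is ignored for the key.
    selected = {name: params.get(name) for name in KEY_PARAMS}
    # sort_keys makes the string independent of dict insertion order.
    params_json = json.dumps(selected, sort_keys=True, separators=(",", ":"))
    return query, engine, params_json


if __name__ == "__main__":
    print(build_cache_key("site:example.com", "example-engine", {"max_results": 10}))
```

Because the JSON string is canonical, two searches that differ only in the order their options were supplied map to the same cache entry.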
3. Database & History
Introduction
AtDork can persistently store all queries and their results in a SQLite database. This enables resuming interrupted batches, viewing search history, deduplication across sessions, and exporting everything to JSON/CSV.
Functions
| Flag | Description | Default |
|---|---|---|
| `--db-path PATH` | Database file path. | atdork.db |
| `--resume` | Continue a previously interrupted batch (re‑runs queries with pending or failed status). | |
| `--history` | Display a list of all previously executed queries with their status. | |
| `--no-dedup` | Disable global URL deduplication (by default, duplicate URLs across queries are skipped). | |
| `--export-db PATH` | Export the entire database to a JSON or CSV file (format guessed from extension). | |
Usage Examples
# Resume an interrupted batch
atdork --resume
# View past searches
atdork --history
# Export all stored results to JSON
atdork --export-db all_results.json
# Export to CSV
atdork --export-db all_results.csv
# Disable deduplication (keep every URL even if seen before)
atdork -q "test" --no-dedup
# Use a custom database location
atdork --db-path /secure/path/atdork.db --history
How It Works
- A `Database` object is created from `core/database.py`. It manages two tables:
  - `queries` (id, query_text, status, timestamps)
  - `results` (id, query_id, title, href, body, raw_json, created_at) with a `UNIQUE(query_id, href)` constraint.
- During a batch or single search:
  - Each query is inserted/updated with status `pending` → `completed` or `failed`.
  - Results are inserted one by one; duplicates (same query_id + href) are silently ignored (unless `--no-dedup`).
- `--resume`:
  - Reads all queries with status `pending` or `failed`, re‑executes them, and updates their status.
- `--history`:
  - Prints all rows from the `queries` table with their status and timestamp.
- `--export-db`:
  - Joins `queries` and `results`, then writes either a JSON (dict `{query_id: [results]}`) or a CSV (flat table with query_text, title, href, body).
- `--no-dedup`:
  - When this flag is not set (deduplication active), the `add_result()` method returns `True` only if a new row was actually inserted (using `INSERT OR IGNORE` and checking `cursor.rowcount`). The batch runner then counts only truly new results.
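
The two-table layout and the `INSERT OR IGNORE` check can be sketched as follows. The table names, the listed columns, the `UNIQUE(query_id, href)` constraint, and the `add_result()` name come from this guide; the class name `DatabaseSketch`, the other method names, the timestamp column names, and the `UNIQUE` constraint on `query_text` are assumptions for illustration only.

```python
import sqlite3

SCHEMA = """
CREATE TABLE IF NOT EXISTS queries (
    id INTEGER PRIMARY KEY,
    query_text TEXT NOT NULL UNIQUE,          -- uniqueness is an assumption
    status TEXT NOT NULL DEFAULT 'pending',
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS results (
    id INTEGER PRIMARY KEY,
    query_id INTEGER NOT NULL REFERENCES queries(id),
    title TEXT,
    href TEXT NOT NULL,
    body TEXT,
    raw_json TEXT,
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    UNIQUE(query_id, href)
);
"""


class DatabaseSketch:
    """Hypothetical stand-in for the Database object described above."""

    def __init__(self, path: str = ":memory:"):
        self.conn = sqlite3.connect(path)
        self.conn.executescript(SCHEMA)

    def add_query(self, query_text: str) -> int:
        # New queries start as 'pending'; an existing row is left untouched.
        self.conn.execute(
            "INSERT OR IGNORE INTO queries (query_text) VALUES (?)", (query_text,)
        )
        self.conn.commit()
        row = self.conn.execute(
            "SELECT id FROM queries WHERE query_text = ?", (query_text,)
        ).fetchone()
        return row[0]

    def set_status(self, query_id: int, status: str) -> None:
        self.conn.execute(
            "UPDATE queries SET status = ?, updated_at = CURRENT_TIMESTAMP WHERE id = ?",
            (status, query_id),
        )
        self.conn.commit()

    def add_result(self, query_id: int, title: str, href: str, body: str = "") -> bool:
        # The UNIQUE(query_id, href) constraint makes a duplicate insert a no-op,
        # and rowcount reports whether a row was actually written.
        cur = self.conn.execute(
            "INSERT OR IGNORE INTO results (query_id, title, href, body) VALUES (?, ?, ?, ?)",
            (query_id, title, href, body),
        )
        self.conn.commit()
        return cur.rowcount == 1

    def pending_queries(self) -> list:
        # What a resume step would need: everything not yet completed.
        return self.conn.execute(
            "SELECT id, query_text FROM queries "
            "WHERE status IN ('pending', 'failed') ORDER BY id"
        ).fetchall()
```

Letting the database enforce uniqueness avoids a separate lookup before each insert and keeps the "new result" count accurate even when a batch is re-run.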
This database system makes AtDork suitable for long‑running engagements and professional reporting.
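
For reporting, the export step described under `--export-db` might look roughly like the sketch below. The function name `export_db` and its return value are hypothetical; only the join, the extension-based format choice, and the two output shapes are taken from this guide. It expects an open `sqlite3` connection whose `queries` and `results` tables have the columns named earlier.

```python
import csv
import json
import sqlite3
from pathlib import Path


def export_db(conn: sqlite3.Connection, path: str) -> int:
    """Write all stored results to JSON or CSV, chosen by file extension."""
    rows = conn.execute(
        "SELECT q.id, q.query_text, r.title, r.href, r.body "
        "FROM queries q JOIN results r ON r.query_id = q.id "
        "ORDER BY q.id, r.id"
    ).fetchall()
    suffix = Path(path).suffix.lower()
    if suffix == ".json":
        # Group results under their query id; JSON object keys must be strings.
        grouped = {}
        for query_id, _query_text, title, href, body in rows:
            grouped.setdefault(str(query_id), []).append(
                {"title": title, "href": href, "body": body}
            )
        Path(path).write_text(json.dumps(grouped, indent=2), encoding="utf-8")
    elif suffix == ".csv":
        # Flat table: one line per result, prefixed with its query text.
        with open(path, "w", newline="", encoding="utf-8") as handle:
            writer = csv.writer(handle)
            writer.writerow(["query_text", "title", "href", "body"])
            writer.writerows(row[1:] for row in rows)
    else:
        raise ValueError(f"unsupported export extension: {suffix!r}")
    return len(rows)
```

Raising on an unknown extension is a design choice of this sketch; the guide does not say how AtDork handles that case.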