Database & History
Introduction
AtDork can persistently store all queries and their results in a SQLite database. This enables resuming interrupted batches, viewing search history, deduplication across sessions, and exporting everything to JSON/CSV for reporting and analysis.
Functions
| Flag | Description | Default |
|---|---|---|
| `--db-path PATH` | Database file path. | `atdork.db` |
| `--resume` | Continue a previously interrupted batch (re-runs queries with `pending` or `failed` status). | |
| `--history` | Display a list of all previously executed queries with their status. | |
| `--no-dedup` | Disable global URL deduplication. By default, duplicate URLs across queries are skipped. | |
| `--export-db PATH` | Export the entire database to a JSON or CSV file (format is guessed from the file extension). | |
Usage Examples
```bash
# Resume an interrupted batch
atdork --resume

# View past searches
atdork --history

# Export all stored results to JSON
atdork --export-db all_results.json

# Export to CSV
atdork --export-db all_results.csv

# Disable deduplication (keep every URL even if seen before)
atdork -q "test" --no-dedup

# Use a custom database location and view history
atdork --db-path /secure/path/atdork.db --history
```
How It Works
- Database Structure
  The `Database` class (from `core/database.py`) creates two tables:
  - `queries`: stores each unique query text, its status (`pending`, `running`, `completed`, `failed`), and timestamps.
  - `results`: stores individual search results (title, URL, snippet, and raw JSON). A `UNIQUE(query_id, href)` constraint prevents duplicate URLs for the same query.
- During a Search
  - Every query is inserted (or updated) with status `pending`, then set to `completed` (or `failed`) after execution.
  - Each result is inserted individually. If the same URL already exists for that query, the row is ignored; this is the built-in deduplication.
  - Use `--no-dedup` to force all results to be stored, even if they are duplicates.
- Resuming a Batch
  - `--resume` reads all queries with status `pending` or `failed` from the database.
  - Those queries are re-executed, and their status is updated accordingly. Completed queries are skipped.
  - This allows you to recover from a crash or an intentional pause without repeating work.
- Search History
  - `--history` prints a table of all queries stored in the database, along with their status and timestamps, giving an overview of past activity.
- Exporting Data
  `--export-db <file>` joins the `queries` and `results` tables and writes:
  - a JSON file containing a dictionary `{query_text: [results]}`, or
  - a CSV file with columns `query_text, title, href, body`.
  - The format is auto-detected from the file extension.
- Thread Safety
  The database uses SQLite's WAL (Write-Ahead Logging) mode and proper locking, so it is safe for concurrent access during multi-threaded batch processing.
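The storage behaviour described in the list above can be illustrated with a minimal sketch built on Python's standard `sqlite3` module. This is not AtDork's actual `core/database.py`. The column names `id`, `created_at`, `updated_at`, and `raw_json`, and the helper names `open_db`, `add_query`, `set_status`, `add_result`, and `resumable`, are assumptions made for illustration. Only the two table names, the status values, the `UNIQUE(query_id, href)` constraint, and WAL mode come from the description above. The sketch covers per-query deduplication only and does not model `--no-dedup`.

```python
import sqlite3

# Hypothetical schema: only the table names, the status values, and the
# UNIQUE(query_id, href) constraint are taken from the documentation.
SCHEMA = """
CREATE TABLE IF NOT EXISTS queries (
    id         INTEGER PRIMARY KEY,
    query_text TEXT NOT NULL UNIQUE,
    status     TEXT NOT NULL DEFAULT 'pending',
    created_at TEXT DEFAULT CURRENT_TIMESTAMP,
    updated_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS results (
    id       INTEGER PRIMARY KEY,
    query_id INTEGER NOT NULL REFERENCES queries(id),
    title    TEXT,
    href     TEXT NOT NULL,
    body     TEXT,
    raw_json TEXT,
    UNIQUE (query_id, href)
);
"""


def open_db(path=":memory:"):
    """Open the database, request WAL mode, and create the tables."""
    conn = sqlite3.connect(path)
    # WAL applies to file-backed databases; an in-memory one keeps its own mode.
    conn.execute("PRAGMA journal_mode=WAL")
    conn.executescript(SCHEMA)
    return conn


def add_query(conn, text):
    """Register a query as 'pending' (no-op if already known); return its id."""
    conn.execute("INSERT OR IGNORE INTO queries (query_text) VALUES (?)", (text,))
    conn.commit()
    row = conn.execute(
        "SELECT id FROM queries WHERE query_text = ?", (text,)
    ).fetchone()
    return row[0]


def set_status(conn, query_id, status):
    """Move a query to 'running', 'completed', or 'failed'."""
    conn.execute(
        "UPDATE queries SET status = ?, updated_at = CURRENT_TIMESTAMP WHERE id = ?",
        (status, query_id),
    )
    conn.commit()


def add_result(conn, query_id, title, href, body):
    """Store one result; return False if the URL already exists for this query."""
    cur = conn.execute(
        "INSERT OR IGNORE INTO results (query_id, title, href, body) "
        "VALUES (?, ?, ?, ?)",
        (query_id, title, href, body),
    )
    conn.commit()
    return cur.rowcount == 1


def resumable(conn):
    """Return the query texts that a resume step would re-run."""
    rows = conn.execute(
        "SELECT query_text FROM queries "
        "WHERE status IN ('pending', 'failed') ORDER BY id"
    )
    return [row[0] for row in rows]
```

In this sketch, `INSERT OR IGNORE` lets the `UNIQUE` constraint do the deduplication work, so a repeated URL is dropped by SQLite itself and no separate lookup is needed.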
This persistent storage makes AtDork suitable for long‑running engagements, audit trails, and professional reporting.
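The export step described under Exporting Data could look roughly like the sketch below. It assumes a hypothetical function name `export_db` and a schema in which `queries` has `id` and `query_text` columns and `results` has `id`, `query_id`, `title`, `href`, and `body` columns. AtDork's real implementation may differ, including in how it treats queries that have no results.

```python
import csv
import json
import sqlite3
from pathlib import Path


def export_db(conn: sqlite3.Connection, out_path: str) -> int:
    """Join queries and results, then write JSON or CSV based on the extension.

    Returns the number of exported result rows.
    """
    # Inner join: queries without any stored result do not appear in the output.
    rows = conn.execute(
        "SELECT q.query_text, r.title, r.href, r.body "
        "FROM queries AS q JOIN results AS r ON r.query_id = q.id "
        "ORDER BY q.id, r.id"
    ).fetchall()

    path = Path(out_path)
    suffix = path.suffix.lower()

    if suffix == ".json":
        # Shape: {query_text: [result, ...]}
        grouped = {}
        for query_text, title, href, body in rows:
            grouped.setdefault(query_text, []).append(
                {"title": title, "href": href, "body": body}
            )
        path.write_text(json.dumps(grouped, indent=2), encoding="utf-8")
    elif suffix == ".csv":
        with path.open("w", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh)
            writer.writerow(["query_text", "title", "href", "body"])
            writer.writerows(rows)
    else:
        raise ValueError(f"unsupported export format: {suffix!r}")

    return len(rows)
```

Picking the writer from `Path.suffix` mirrors the documented behaviour of guessing the format from the file extension. Raising on an unknown extension is a design choice of this sketch, not documented AtDork behaviour.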