“A single leak can illuminate the machinery of oppression that thrives in darkness, offering the public a glimpse of truths carefully buried.”
Early in 2024, I heard Micah Lee, a programmer and former infosec specialist and investigative journalist at The Intercept, interviewed on the podcast Firewalls Don’t Stop Dragons. Lee was discussing his soon-to-be-released book, Hacks, Leaks, and Revelations: The Art of Analyzing Hacked and Leaked Data. Already a fan of his work (Lee is among the rare few who have seen all the Snowden leaks), I was immediately drawn to the project and keen to purchase the book. Between wanting to support my local independent bookstore and the publisher’s shipping process, though, it took some time for the physical copy to reach me in Australia.
When it finally arrived, I was in the midst of study goals and work-training-life commitments. Light reading and tiny dips into the book were all I could manage until December, when I could fully dive into Project HLR.
What struck me first was the book’s purpose: Lee aims to empower journalists, their sources, and activists by equipping them with the technological knowledge, tools, and confidence to protect themselves and anyone they may interact with. He provides guidance on validating data, applying critical thinking, and leveraging OSINT skills to uncover hidden truths, including lessons on using tools to assess claims made by hackers.
Lee uses real-world examples and datasets, including BlueLeaks and Oath Keepers, to make his lessons tangible. Learning how to safely access datasets such as these is thrilling. Through tools like Tor Browser, OnionShare, Signal, and SecureDrop, Lee illustrates how large datasets can be securely shared and accessed. His firsthand tips are incredibly helpful, walking through every step, from minimising digital trails to communicating securely. One of the most fascinating aspects is learning to use Python and SQL to analyse datasets sourced from sites like DDoSecrets (Distributed Denial of Secrets) and to cross-reference that data with OSINT.
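To give a flavour of what combining Python and SQL can look like, here is a minimal sketch of my own. It is not taken from the book, and the sample rows, column names, and function names (`load_rows`, `count_by_sender`) are all made up for illustration. It loads a few CSV rows into an in-memory SQLite table and counts messages per sender, which is roughly the kind of first question you might ask of a new dataset.

```python
import csv
import io
import sqlite3

# Made-up sample rows standing in for a CSV file from a dataset (assumption).
SAMPLE_CSV = """sender,recipient,subject
alice@example.org,bob@example.org,Meeting notes
carol@example.org,bob@example.org,Budget
alice@example.org,dave@example.org,Follow-up
"""


def load_rows(csv_text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def count_by_sender(rows):
    """Load rows into an in-memory SQLite table and count messages per sender."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE messages (sender TEXT, recipient TEXT, subject TEXT)"
    )
    # Named placeholders are filled from each row's dict keys.
    conn.executemany(
        "INSERT INTO messages VALUES (:sender, :recipient, :subject)", rows
    )
    result = conn.execute(
        "SELECT sender, COUNT(*) AS total FROM messages "
        "GROUP BY sender ORDER BY total DESC, sender"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for sender, total in count_by_sender(load_rows(SAMPLE_CSV)):
        print(sender, total)
```

With a real dataset you would read from a file instead of a string, and the table layout would depend entirely on what the data contains.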
For me, this deep dive into research-driven work is absolutely fascinating. I love learning to leverage tools with real-world examples, and exploring datasets is just as exciting. The datasets themselves range from law enforcement records to extremist group data and ransomware gang information. Although I’m not engaging in hacking or any illegal activity, I’m learning how to access, analyse, and organise data responsibly. The next step will be extracting meaningful insights to write reports (which I cannot wait to begin work on!).
So far, Project HLR has reminded me how much I adore deep work and enjoy investigative and forensic-focused tasks. It has also helped clarify the path I want to pursue in 2025: Digital Forensics. Tracing digital trails, data analysis and authentication, secure handling of evidence, ethical and legal considerations, and OSINT applications in cybersecurity breaches and responses are exactly the kind of challenges that ignite my curiosity and drive. Ultimately, I’d love to return to where my career began—in the not-for-profit sector—and digital forensics feels like the right path in cybersecurity to combine my experience with new skills to make a meaningful impact. Even halfway through, Project HLR has delivered a personal revelation, and already made my summer brighter.
For anyone interested in working with datasets or exploring leaked content, I cannot recommend Lee’s book highly enough. It sat on my desk tempting me for months before I could dive in, but now that I have, I can’t fault it. However, it’s not for the faint-hearted: there is confronting content in the datasets, which may be triggering for some, and depending on where you are in your cybersecurity or IT journey, the skills involved can present steep learning curves. That said, this space where challenge meets growth is where I thrive. It’s where I feel most alive and in a flow state.
As I work through the second half of the book, I plan to write reports on various datasets, exploring areas of intrigue and learning how to integrate these skills into my cybersecurity tool belt moving forward.
Interesting appendix:
Lee recently released a tool, Cyd, to help people delete their data from X (formerly Twitter). After being suspended from the platform in December 2022 for sharing a link to an account that tracked Elon Musk’s private jet movements, and for other posts which apparently offended Musk, Lee developed the tool to assist others with similar privacy concerns.
Earlier this year, Lee was among several journalists let go in job cuts at The Intercept. If you’re able to support his important work, I encourage you to do so. In the interest of empowering more people to work with large datasets, he has also generously made the book open source. Personally, I love supporting authors doing important work, and I adore the feel of a new book and the act of jotting notes in my Moleskine before typing up digital summaries. If you prefer the digital route, or a combination of both as I do, Lee has also made all the code available on GitHub.
