NeLI Pod
Podcast Description
The official podcast of the National eDiscovery Leadership Institute. Here, we bridge the gap between technology and the law, bringing you insights from the forefront of electronic discovery.
Podcast Insights
Content Themes
The podcast focuses on themes such as technology in legal practice, eDiscovery best practices, the adoption of AI in law, and ethical considerations in digital evidence handling. Examples of specific episodes include discussions on AI's role in the judiciary, challenges of hyperlinked documents in eDiscovery, and the evolution of technology-assisted review methodologies.

Guest: Dr. Jeremy Pickens
Managing Director of Applied Science, Elevate
Episode Overview
In this masterclass‑level conversation, Dr. Jeremy Pickens—one of the most respected information retrieval scientists in e‑discovery—joins Daniel and Brandon to explore the intellectual foundations and future trajectory of search, relevance, and AI in legal practice. Jeremy’s work has shaped the evolution from keyword search to TAR 1.0, to continuous active learning (CAL), and now to the GenAI era. If you’ve used active learning in any modern review platform, you’ve likely benefited from his research.
The discussion ranges from polyphonic music retrieval to tokenization, from ancient Greek philosophy to the cold‑start problem in TAR, and from contextual diversity to the challenges of evaluating AI systems. Jeremy brings a rare blend of deep technical rigor and practical sensibility, offering a perspective that helps legal professionals understand not just what works, but why it works.
Key Takeaways
- Patterns matter more than keywords. Jeremy’s early work in polyphonic music retrieval mirrors the complexity of legal documents—both require identifying structural patterns, not just surface‑level signals.
(“Finding those connections historically over time… is very similar to the storytelling and pattern finding we want to do in e‑discovery.”)
- Feature extraction is as important as the algorithm. Tokenization, stemming, and sub‑word representations can make or break a machine learning model’s ability to recognize meaning across documents.
- Outcome‑driven evaluation beats checkbox shopping. Lawyers should focus on how well a system performs on real data—not on whether it claims to use a particular algorithm or technique.
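To make the feature-extraction takeaway concrete, here is a minimal sketch of the three steps mentioned above: tokenization, stemming, and sub-word (character n-gram) features. This is an illustrative toy, not the pipeline discussed in the episode; real review platforms use far more sophisticated implementations (e.g., Porter or Snowball stemmers).

```python
import re

def tokenize(text):
    # Naive tokenizer: lowercase, split on non-alphanumeric runs.
    return [t for t in re.split(r"[^a-z0-9]+", text.lower()) if t]

def stem(token):
    # Toy suffix stripper; production systems use Porter/Snowball stemming.
    for suffix in ("ing", "ed", "s"):
        if token.endswith(suffix) and len(token) > len(suffix) + 2:
            return token[: -len(suffix)]
    return token

def char_ngrams(token, n=3):
    # Sub-word features: OCR-noisy variants like "privi1eged" still share
    # most character trigrams with the clean form "privileged".
    padded = f"<{token}>"
    return [padded[i : i + n] for i in range(len(padded) - n + 1)]

tokens = tokenize("Reviewing privileged documents")
stems = [stem(t) for t in tokens]
print(stems)  # ['review', 'privileg', 'document']
print(char_ngrams("privileged")[:4])  # ['<pr', 'pri', 'riv', 'ivi']
```

The point of the sketch is the one Jeremy makes: the model never sees raw text, only these extracted features, so choices made here bound what the learning algorithm can recognize.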
Action Items for Legal Teams
- Evaluate platforms using simulations, not demos. Ask vendors to run your data through their system to measure recall, precision, and learning speed.
- Understand the basics of tokenization. Even a high‑level grasp helps practitioners make better decisions about search and review workflows.
- Adopt CAL for early signal exploitation. Even a single coded document provides useful information—there’s no need for massive seed sets.
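The simulation-based evaluation suggested above boils down to scoring retrieved documents against a known-relevant set. A minimal sketch of the two headline metrics, using hypothetical document IDs (not data from the episode):

```python
def recall_precision(retrieved, relevant):
    # Recall: share of all relevant docs the system found.
    # Precision: share of retrieved docs that are actually relevant.
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision

# Hypothetical IDs from one simulated review run.
relevant = {1, 2, 3, 4, 5}   # ground-truth relevant documents
retrieved = {2, 3, 5, 8, 9}  # documents the system surfaced
r, p = recall_precision(retrieved, relevant)
print(f"recall={r:.2f} precision={p:.2f}")  # recall=0.60 precision=0.60
```

Running this over your own coded data, rather than a vendor demo set, is what "outcome-driven evaluation" means in practice.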
Chapters & Timecodes
00:00:00 – Introduction
Daniel and Brandon introduce Dr. Jeremy Pickens and his impact on the field.
00:03:04 – Jeremy’s Philosophy: Being “Part of the Flow”
Why ideas in e‑discovery evolve collectively, not individually.
00:04:55 – From Polyphonic Music to Legal Documents
How musical pattern analysis informed Jeremy’s approach to information retrieval.
00:08:34 – Short Messages, Semantic Boundaries, and IR Challenges
Why Slack, Teams, and SMS require smarter segmentation techniques.
00:11:03 – Feature Extraction 101
Tokenization, stemming, n‑grams, and why they matter for TAR.
00:14:53 – Sub‑Word Tokenization and OCR
How character‑level patterns help overcome noisy text.
00:17:48 – What Practitioners Should Ask Vendors
Why checklists fail—and what outcome‑driven evaluation looks like.
00:20:34 – The Importance of Frequent Model Updates
How recalculating rankings every two minutes improved precision by up to 20%.
00:22:40 – Why Simulations Are the Missing Piece
Jeremy explains why the industry needs better evaluation frameworks.
00:24:40 – Contextual Diversity: Finding What You Don’t Know
How algorithms identify unexplored pockets of documents.
00:28:56 – Solving the Cold‑Start Problem
Why CAL can begin learning from the very first document.
00:30:01 – Greek Philosophy and TAR
Parmenides vs. Heraclitus as a metaphor for TAR 1.0 vs. TAR 2.0.
Compelling Quote
“You don’t know what you don’t know… and the machine can look globally across the entire collection to find what you’ve never seen before.”

Disclaimer
This podcast’s information is provided for general reference and was obtained from publicly accessible sources. The Podcast Collaborative neither produces nor verifies the content, accuracy, or suitability of this podcast. Views and opinions belong solely to the podcast creators and guests.
For a complete disclaimer, please see our Full Disclaimer on the archive page. The Podcast Collaborative bears no responsibility for the podcast’s themes, language, or overall content. Listener discretion is advised. Read our Terms of Use and Privacy Policy for more details.