Building a taxonomy for first-time tax filers
When the IRS set out to improve digital services for first-time taxpayers, it became apparent that many filers were struggling to find the necessary information. One reason? The IRS and taxpayers often spoke two very different languages.
To address this, I led a team that performed an inventory of existing taxonomies, conducted user research, and analyzed search data to determine user-friendly terms and labels. The year-long project laid the foundation for a centralized, scalable taxonomy designed to improve content findability and deliver a more intuitive experience for new filers across IRS.gov and related platforms.
Challenge
Business problem: Existing taxonomies were fragmented, org-chart driven, and outdated. They reflected internal logic, not user behavior or natural language.
User problem: Navigation labels were too audience-based, vague, or jargony for first-time filers to understand and find what they needed to complete their taxes. They needed a taxonomy that matched layman’s terms with IRS-speak. Some areas of disconnect included:
- Where to find step-by-step filing instructions: Tax basics? Filing 101?
- What to call independent employees: Gig workers? Freelancers? Part-time employees?
- How to tell filing methods apart: Free File? Direct File?
Strategic challenge: How do you explain and label complicated tax concepts so they resonate with people who have never filed before and with those who rarely file? That is a significant number of people, since most of us file only once a year.
Goals
- Reduce taxpayer frustration and cognitive load
- Improve search and navigation
- Ensure consistency across channels and content types
- Lay the foundation for future metadata and AI integrations
My role
I led the year-long project with team members from across the IRS, including the Office of Online Services and supporting contractors. My role included:
- Project management (Jira, Confluence)
- Scheduling and leading meetings (Microsoft Teams, SharePoint)
- Conducting user research and analytics deep-dives (Figma, Google Search Console, Looker Studio)
- Documenting existing taxonomies in the content management system (Drupal, Excel)
- Leading training for content teams on taxonomy basics and best practices (Teams, PowerPoint)
My approach
With more than 30,000 webpages and multiple applications and chatbots, finding all the existing taxonomies presented a massive challenge. I used both a top-down and a bottom-up approach and focused on the most relevant taxonomies:
- Identified taxonomies used in the primary content management system, help desk systems, enterprise search, and others.
- Conducted competitive analysis in the tax area, including consulting existing tax-focused taxonomies.
- Collected this information in spreadsheets, screenshots, and reports. Building on existing taxonomy work, we developed a taxonomy inventory template to collect information and store it in a SharePoint site.
- Conducted stakeholder interviews and system walkthroughs to find “hidden” taxonomies.
- Incorporated standards from Schema.org, ISO, NISO, W3C, and FIBO.
We also reviewed the top search terms for both organic and site search to help determine the terms and labels users were using. From this work, we began building a master thesaurus and a list of controlled vocabularies for continued edits.
Throughout the process, I worked to engage stakeholders and gain buy-in on the importance of a taxonomy as a foundation for improving search and chatbot responses and, ultimately, for using structured data and metadata. I created and delivered multiple presentations that introduced basic taxonomy concepts to leadership and other key audiences across the Service. One aspect of this was demonstrating how a taxonomy would benefit us immediately by enhancing search on the current website with improved faceted filtering, typeahead with synonyms, tags, and subject-filtering options.
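To make "typeahead with synonyms" concrete, here is a minimal Python sketch of the idea. It assumes a simple lookup table that maps every label, preferred or alternative, to a preferred label. The function name `suggest`, the table `LABELS`, and the label pairings are all hypothetical and chosen for illustration; they are not the project's actual vocabulary or the site's search implementation.

```python
def suggest(prefix, label_to_preferred, limit=5):
    """Return preferred labels whose preferred or alternative label starts with the prefix."""
    needle = prefix.strip().casefold()
    if not needle:
        return []
    suggestions = []
    # Sort the labels so the suggestion order is predictable.
    for label in sorted(label_to_preferred):
        if label.startswith(needle):
            preferred = label_to_preferred[label]
            if preferred not in suggestions:
                suggestions.append(preferred)
    return suggestions[:limit]


# Hypothetical label table: lowercase label -> preferred label.
LABELS = {
    "gig worker": "Gig worker",
    "freelancer": "Gig worker",
    "free file": "Free File",
    "direct file": "Direct File",
}

print(suggest("free", LABELS))
```

In this sketch, typing "free" surfaces both a filing method and a worker type, because a synonym match is resolved to its preferred label before it is shown.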

A user journey-based taxonomy
Given the tight timeframe, I wanted to focus on delivering a proposed taxonomy for one key user type that leadership had identified as a top priority: the first-time filer. I worked with user research teams to incorporate questions about tax terms and labels into user testing sessions. I also attended user testing sessions and took notes, recording specific terms that caused confusion and the names users actually used for things.
I began building a spreadsheet with the steps first-time filers take in the process, along with the key terms associated with each step. From there, I added a list of synonyms and then the top search terms and queries. Finally, I worked with the team and stakeholders to select the preferred label for each term and to identify alternative labels.
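The spreadsheet structure described above can be sketched as a small data model. The following is an illustrative Python sketch, not the actual deliverable: the `Concept` class, its field names, the journey-step names, and the choice of preferred labels are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """One spreadsheet row: a journey step, a preferred label, and its alternatives."""
    step: str
    pref_label: str
    alt_labels: list = field(default_factory=list)


def build_label_index(concepts):
    """Map every label (preferred or alternative) to its preferred label."""
    index = {}
    for concept in concepts:
        for label in [concept.pref_label, *concept.alt_labels]:
            index[label.casefold()] = concept.pref_label
    return index


def resolve(term, index):
    """Return the preferred label for a user's term, or None if it is not in the vocabulary."""
    return index.get(term.strip().casefold())


# Hypothetical rows; the real preferred labels were chosen with stakeholders.
concepts = [
    Concept("Gather documents", "Gig worker", ["Freelancer", "Independent contractor"]),
    Concept("Choose how to file", "Free File", ["Free tax filing"]),
]
index = build_label_index(concepts)

print(resolve("freelancer", index))
```

Keeping alternative labels attached to a single preferred label is what lets search, navigation, and content use one consistent term while still recognizing the words taxpayers type.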
The final deliverables were the First-Time Filer Taxonomy spreadsheet, our draft of a controlled vocabulary, and a report outlining next steps for integrating the taxonomy with metadata fields. I also developed draft documentation for taxonomy maintenance and evolution.
Results
The inventory and assessment helped us understand the complexity of the IRS taxonomy systems and the scope of a comprehensive taxonomy across the agency. In several cases, the same term had multiple definitions depending on the business unit involved.
By focusing on first-time tax filers, the project laid the groundwork for additional journey-based taxonomies, with taxonomy templates and a guiding structure now in place.
Analyzing search data and incorporating findings from user research helped establish preferred labels based on terms that taxpayers used, rather than IRS-speak.
That research enabled me to make immediate improvements in content, such as updating the Individual Tax Filing page and navigation to use the preferred terms of first-time filers.

By the Numbers
- 100+ concepts
- 250+ broader/narrower relationships
- Designed for multilingual deployment
Next steps
I hope our work will build a foundation for dynamic content filters and user segmentation. Ultimately, it could facilitate semantic interoperability for speech-to-text, chatbot intents, and other applications.
The taxonomy should be a key piece of infrastructure, enabling better search, smarter automation, and more human-centered digital experiences.