Writing Community, Listen Up: Google and Microsoft Are Training AI on Your Unpublished Work

By Taper.ink

When an author submits a query to a literary agent, they’re trusting that agent with something deeply personal — months or years of work, a story they haven’t shared with anyone, pages they’ve revised a hundred times.

Most of the time, that query lands in Gmail.

And that’s where the trust gets complicated.


What Google Actually Does With Your Inbox

This is not speculation. From Google’s Privacy Policy, in their own words:

“We also collect the content you create, upload, or receive from others when using our services. This includes things like email you write and receive, photos and videos you save, docs and spreadsheets you create.”

That’s Gmail and Google Docs, explicitly named. And what does Google do with that collected content? Also from their own policy:

“We use artificial intelligence and machine learning… As part of this continual improvement…”

In 2024, Google took it further. They embedded Gemini — their AI product — directly into Gmail and Google Docs as “Gemini in Workspace.” Gemini in Workspace is explicitly designed to read your emails and documents to generate responses, suggest edits, and summarize content. Google isn’t just storing your data anymore. They’ve built an AI product whose entire function is to read it.

Every query letter an agent receives in Gmail. Every synopsis. Every cover letter an author agonized over. All of it is content Google collects — and all of it is now inside the same infrastructure powering their AI.


Microsoft Built the Same Machine, and Named It

Microsoft 365 Copilot is less subtle. From their own documentation:

“Microsoft 365 Copilot accesses content and context through Microsoft Graph. It can generate responses anchored in your organizational data, such as user documents, emails, calendar, chats, meetings, and contacts.”

If you’re an agent using Outlook or Word, that is your query inbox. Microsoft built an AI product whose explicit purpose is to read it.

Microsoft says Copilot data isn’t used to train its foundation models, at least for now. But that distinction matters less than it sounds. Whether they call it “training,” “improving,” or “powering responses,” the result is the same: an AI system is reading your authors’ unpublished work. Work they never agreed to share with a machine.


The Industry Already Knows What Happens When AI Gets Access to Creative Work

In January 2026, the Authors Guild announced the terms of a $1.5 billion settlement with Anthropic, reached after a court found that the AI company had illegally downloaded pirated copies of copyrighted books for its training library. Authors received approximately $3,000 per book, for work that powered one of the most commercially successful AI systems in the world.

The same month, the Authors Guild formally raised concerns about Amazon Kindle’s “Ask This Book” AI feature — which uses book content to answer reader queries without author consent.

The pattern is the same every time: a platform gains access to creative work, uses it to build AI capability, and authors find out afterward.


The Tools Agents Are Using Were Built Before Any of This Existed

QueryManager launched in 2009. Submittable has been around since 2010. Neither was designed with AI training pipelines in mind, because AI training pipelines didn’t exist at scale yet.

When an agent uses a 15-year-old tool that syncs with Gmail, they aren’t just using old software. They’re routing sensitive, unpublished creative IP through infrastructure that Google has since turned into an AI engine.

The authors who queried through those inboxes never consented to that. Neither did the agents.


If You’re Using These Tools, You Are the Product

Google’s business model has never been selling software. It’s selling understanding — of what you read, what you write, who you talk to, and what you’re working on. Gmail is free because your data pays for it.

Microsoft 365 sells software, but Copilot’s value proposition is access — to your calendar, your inbox, your documents. You pay for the subscription. You pay again with your content.

Literary agents who use these platforms for queries are exchanging their authors’ unpublished work for free or discounted software. That trade might be fine for a grocery list. It is not fine for an author’s first novel.


What Taper Does Differently

Taper was built in 2025, with all of this already in the room.

We don’t integrate with Gmail. We don’t sync with Outlook. We don’t pass submissions through Microsoft Graph or Google’s improvement pipeline.

When an author submits a query through Taper, that content stays in Taper. It is never processed by a third-party AI system. It is never used to train models — ours or anyone else’s. No language model will ever read an unpublished manuscript because it passed through our platform.

That’s not a feature. It’s a design principle we decided on before we wrote the first line of code.


The Query Process Is the Most Vulnerable Moment in a Writer’s Career

A query submission carries pages from an unpublished manuscript: work with no market visibility, no copyright registration, and no advocate until an agent says yes.

That moment deserves better than infrastructure built to extract value from everything that flows through it.

We built Taper because we believe it does.


Taper is a query management platform built for literary agents and authors. We’re in pre-release, hand-picking early access agents now. If you manage a query inbox and you care about where your authors’ data lives — we’d like to talk.

Taper.ink