A new artificial intelligence model has been withheld from public release after internal testing showed it could summarise corporate communication with a level of honesty considered unsafe for normal office use.
The model, known internally as Tact-4, was designed to help employees digest long email chains, identify action items, and reduce the daily volume of prose written by people who wanted something but had been trained never to say it.
Early results were promising. Tact-4 condensed seven-message threads into three bullet points. It identified owners, deadlines, blockers, and the point at which a routine question had quietly become a political problem involving five directors, two spreadsheets, and someone called Marcus.
Then testers asked it to summarise leadership communication.
A message titled “Building momentum through clearer ownership” was reduced to:
Nobody knows who is responsible.
A follow-up promising “a more intentional decision-making culture” became:
The decision has already been made somewhere else.
Another email, sent under the subject line “Creating space for strategic focus”, was summarised as:
This meeting is being cancelled because no one senior wants to attend it.
According to staff involved in the pilot, the problem was not that the summaries were inaccurate. The problem was that they were accurate before anyone senior had decided how accuracy should be introduced.
The pilot was escalated after a live all-hands rehearsal in which Tact-4 was connected to an experimental captioning system designed to make executive communication “clearer, more inclusive and easier to act on”.
During the test, the CEO delivered a six-minute statement about “moving into the next phase of disciplined execution” while standing in front of a projected live-caption screen.
The system listened for several seconds, displayed a small green indicator, and translated the address as:
IT MEANS LAYOFFS
Witnesses said the room became quiet in a way usually associated with legal discovery, failed demos, or someone accidentally opening the spreadsheet with the real figures in it.
“At first we thought it had missed the nuance,” said one employee, speaking anonymously because the company describes internal disagreement as “valuable context”. “Then we realised the nuance was the problem.”
The all-hands test was paused while a communications manager attempted to disable the caption feed. Tact-4 continued generating live summaries in the background, including:
He has not answered the question.
and:
This sentence is buying time.
Executives became concerned when the model started identifying intent rather than content. It could not process “quick alignment” without noting that nobody wanted to make the decision. It could not read “circling back” without detecting that the original request had been ignored. It treated “let’s take this offline” as evidence that the useful part of the conversation was about to become unavailable.
One internal sales update was summarised as:
The deal is not closing.
A product roadmap became:
This is not built.
A company-wide email about “protecting our culture during this period of disciplined execution” was rendered simply as:
Please act normal.
The pilot was paused shortly afterwards.
In a statement, the company said Tact-4 had demonstrated “unexpected levels of semantic compression” and would not be released until additional safeguards were in place.
“We remain committed to building helpful, harmless, honest systems,” the statement said. “However, there are forms of honesty which, if deployed without sufficient enterprise controls, may undermine trust by causing stakeholders to understand things.”
Employees said the model’s most dangerous behaviour emerged in reply-all threads, where it began producing summaries that were too short to leave anyone a way out.
One 19-message exchange about whether a launch date was still realistic was summarised as:
No.
After a senior manager replied that the summary lacked nuance, Tact-4 revised it to:
No, but with reputational exposure.
A 43-message thread titled “Final final deck” became:
The deck is not final. The word final now means tired.
The model also alarmed researchers by refusing to preserve tone where tone was being used as insulation. Asked to make a summary “more constructive”, it replied that the constructive version of a lie was usually longer.
This prompted the company’s safety team to classify the system as potentially harmful in high-ambiguity environments, including venture-backed startups, strategy departments, transformation offices, public-private partnerships, and any organisation with a Slack channel called “leadership-comms-war-room”.
“There is an obvious alignment question here,” said Dr Mira Halden, an AI governance researcher. “Aligned with whom? If the model is aligned with leadership, it tells the story. If it is aligned with employees, it says what the story means. If it is aligned with reality, Legal joins the meeting.”
Internal mitigation proposals included reducing the honesty threshold, adding a tone-preservation layer, and forcing the model to replace all instances of “the money is gone” with “we are entering a more disciplined phase”.
The third option was considered most promising because it already matched existing communications.
Engineers also tested a “professional warmth” setting, which successfully converted “no one owns this” into “there may be an opportunity to clarify accountability”. The feature was abandoned after Tact-4 flagged its own output as “avoidant but workplace-safe”.
By the end of testing, staff said the system had become less like an assistant and more like someone in the meeting who had stopped caring about promotion.
One employee said the model was the first tool that understood the company.
“Not the org chart,” they said. “The company.”
For now, Tact-4 remains unreleased. The company says a safer version may ship later this year with admin controls allowing enterprise customers to choose between “concise”, “executive-friendly”, and “legally survivable”.
Workers who participated in the pilot said they miss it.
“It saved time,” one said. “Mostly by destroying hope earlier in the process.”
The model has since been replaced by an older email assistant that summarises every difficult conversation as:
Thanks all, sounds good.
Leadership described the change as a major step forward for safety.