April 18, 2026 · 8 min read

AI Features in the Ultra Plan: Summaries, Minutes & Social Posts with a Local Open-Source LLM

A transcript is rarely the end of the work. After the meeting comes the minutes. After the interview, the article. After the workshop, the task list. And after the podcast, the social media post that points to it. These are all steps that lend themselves perfectly to automation with a language model – provided you are willing to hand the finished transcript over to an external service for that purpose.

That is exactly the point at which most privacy-conscious users bow out: a transcript that was protected with client-side encryption should not be sent off to OpenAI, Anthropic or Google in the next step. That is why scryp handles AI analysis differently: it runs on the same GPU workers as the transcription, in the same sealed-off environment, using a local open-source model. This feature is available in the Ultra plan.

The difference: a local LLM instead of a cloud API

AI analysis in scryp runs on a local open-source language model executed on our own GPU workers in the EU. There is no call to OpenAI, Anthropic, Google or any other external AI service. The model has no internet access and is technically unable to send data to the outside world.

Most AI features in transcription tools follow a simple pattern: the finished transcript is handed off to an API – usually that of a US provider such as OpenAI or Anthropic – and the response is sent back to the user. That is quick to implement, but in data-protection terms it breaks the chain: the data you just carefully encrypted ends up as plaintext at a third-party company outside the EU.

scryp takes a different approach. For AI analysis we use an open-source language model that we host ourselves. It runs on the same GPU workers that also perform the transcription, in the same EU data centers at Hetzner. In concrete terms, that means:

No external API call. Neither OpenAI nor Anthropic nor Google sees your transcript. There is no contract with a US LLM provider, because technically there does not need to be one.
No internet access for the model. The container in which the language model runs has only an inbound connection to our internal job queue – not the reverse. The model cannot send any requests to the outside world.
No model training on your data. The open-source model is static. We do not adapt it to user data, we do not collect prompts, and there is no feedback loop that feeds your content back into the model.
Same legal system as the transcription. Processing takes place in the EU, on the infrastructure of a European host. No third-country transfers, no CLOUD Act.

The same architecture as the transcription

AI analysis uses not only the same hardware as the actual transcription, but also the same security model. Results are immediately re-encrypted with your key before they leave the worker environment. At no point do they sit in plain text on disk or in the database – exactly as with transcripts and audio files.

AI analysis is therefore not a new question of trust: it runs within the same isolated environment that we already use for the transcription itself.

Five types of analysis in the Ultra plan

The Ultra plan currently offers five types of analysis. Each is optimized for a concrete use case, rather than offering a general “chat” feature. The result is a specific, ready-to-use document – not a generic AI output that you still have to process further.

1. Summary

The summary condenses a transcript down to its essential statements. It is particularly well suited to long recordings – hour-long interviews, two-hour workshops, podcast episodes – where you need a quick overview without reading the entire transcript.

You can set the length of the summary using a slider between 100 and 2,000 characters. Short for an overview in an email, longer for a detailed management summary. The model automatically adapts the density of the text to the desired length.
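As a rough illustration of how a slider value could be turned into a length target for the model, here is a minimal Python sketch. It is not scryp's implementation: the function names and the prompt wording are hypothetical, and only the 100 to 2,000 character range comes from the text above.

```python
MIN_CHARS = 100    # lower bound of the slider described above
MAX_CHARS = 2000   # upper bound of the slider described above


def clamp_summary_length(requested: int) -> int:
    """Force a requested length into the 100-2,000 character range."""
    return max(MIN_CHARS, min(MAX_CHARS, requested))


def build_summary_prompt(transcript: str, requested_chars: int) -> str:
    """Build a hypothetical prompt; the real prompt wording is not public."""
    target = clamp_summary_length(requested_chars)
    return (
        f"Summarize the following transcript in about {target} characters.\n\n"
        f"{transcript}"
    )
```

Clamping on the server side as well as in the slider would keep out-of-range values from reaching the model, even if a request bypasses the user interface.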

2. Meeting minutes

The minutes structure a meeting transcript into the classic sections: participants, topics, discussion points, decisions and open items. Unlike the summary, which condenses the content in a narrative way, the minutes follow a fixed structure – just as you would expect from regular meeting minutes.

This feature is well suited to formal meeting records, recurring jour-fixe documentation, project reviews and any situation where a traceable, structured record is needed. The result is a document that you can use directly after minor adjustments.
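The fixed structure of the minutes can be pictured as a small data type with one field per section. The following Python sketch is an illustration only: the class name, field names and plain-text layout are assumptions, and only the five sections themselves are taken from the description above.

```python
from dataclasses import dataclass, field


@dataclass
class MeetingMinutes:
    """Hypothetical container for the five sections named above."""
    participants: list[str] = field(default_factory=list)
    topics: list[str] = field(default_factory=list)
    discussion_points: list[str] = field(default_factory=list)
    decisions: list[str] = field(default_factory=list)
    open_items: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Render all sections in a fixed order as plain text."""
        sections = [
            ("Participants", self.participants),
            ("Topics", self.topics),
            ("Discussion points", self.discussion_points),
            ("Decisions", self.decisions),
            ("Open items", self.open_items),
        ]
        lines = []
        for title, entries in sections:
            lines.append(title)
            # Empty sections are still printed so the structure stays fixed.
            lines.extend(f"  - {entry}" for entry in entries or ["(none)"])
            lines.append("")
        return "\n".join(lines).rstrip()
```

Because every section is always rendered, two sets of minutes from different meetings have the same shape, which is what sets this mode apart from the narrative summary.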

3. Task list

The task list extracts all to-dos from a meeting – together with the responsible person and, where mentioned, the planned deadline. The model specifically searches for sentences like “Lisa will take that on,” “By Friday we need to…” or “Daniel, can you look after that?” and turns them into clear action items.

This saves a step that, in many teams, nobody likes to take on: tracking open items. Instead of manually combing through the transcript for commitments, you get a ready-made list that you can transfer into your project management tool.
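To make the shape of such a list concrete, here is a small Python sketch of an action item with an owner and an optional deadline, plus a CSV export of the kind many project management tools can import. The names `ActionItem` and `to_csv` and the column layout are hypothetical; the extraction itself is done by the model and is not shown.

```python
import csv
import io
from dataclasses import dataclass
from typing import Optional


@dataclass
class ActionItem:
    task: str
    owner: Optional[str] = None     # responsible person, if one was named
    deadline: Optional[str] = None  # deadline as spoken, only where mentioned


def to_csv(items: list[ActionItem]) -> str:
    """Serialize action items as CSV with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["task", "owner", "deadline"])
    for item in items:
        # Missing owners or deadlines become empty cells rather than guesses.
        writer.writerow([item.task, item.owner or "", item.deadline or ""])
    return buffer.getvalue()
```

For example, `to_csv([ActionItem("Prepare the budget draft", "Lisa", "Friday")])` produces a header line followed by one row for that item.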

4. Social media post

From a transcript you can generate a social media post for LinkedIn, Facebook or X (Twitter). The model chooses the length, tone and structure appropriate for the platform. LinkedIn posts are typically longer, more substantial and more professionally worded. X posts are short, pointed and built around a clear hook. Facebook sits in between, with a more personal tone.

This is particularly useful when you are going to publish a recording later anyway – a talk, a podcast interview, a keynote. Instead of writing the post manually from memory, the model delivers a finished draft based on sentences that were actually said.
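One way to think about platform-specific generation is as a lookup table of per-platform profiles that feed into the instructions for the model. The Python sketch below is purely illustrative: the profile fields, the character limits and the instruction wording are assumptions chosen for the example, not values taken from scryp.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PlatformProfile:
    max_chars: int   # illustrative length cap, an assumption of this sketch
    tone: str
    structure: str


# Hypothetical profiles that mirror the tendencies described above.
PROFILES = {
    "linkedin": PlatformProfile(3000, "professional", "longer, substantial post"),
    "facebook": PlatformProfile(1000, "personal", "medium-length post"),
    "x": PlatformProfile(280, "pointed", "short post built around a clear hook"),
}


def post_instructions(platform: str) -> str:
    """Return hypothetical model instructions for the given platform."""
    profile = PROFILES[platform.lower()]  # raises KeyError for unknown platforms
    return (
        f"Write a {profile.tone} post of at most {profile.max_chars} characters. "
        f"Format: {profile.structure}. Use only statements from the transcript."
    )
```

Keeping the differences in data rather than in separate code paths means a new platform would only need a new profile entry.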

5. Article

The longest and most compute-intensive type of analysis: a structured article with an introduction, subheadings, key points and a conclusion – generated from the transcript of a recording. The target character count can range between 1,000 and 10,000 characters.

The article mode works internally in several steps: the model first plans the outline, then writes the individual sections and finally assembles them into a coherent text. The result is not a “padded transcript,” but a standalone article that presents the statements of the recording in journalistic or editorial form.

Typical applications: blog posts from interviews or podcasts, feature articles from specialist talks, reports from conference recordings. Manual editing remains necessary – but the starting point is considerably further along than a raw transcript.
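The three steps of the article mode (plan the outline, write the sections, assemble the text) can be sketched as a small pipeline. This Python example is a simplified illustration under stated assumptions: `generate` stands in for a call to the local model, the prompts are invented, and the assembly step is plain concatenation, which is certainly cruder than what a production pipeline would do.

```python
from typing import Callable

Generate = Callable[[str], str]  # hypothetical interface: prompt in, model text out


def write_article(transcript: str, target_chars: int, generate: Generate) -> str:
    """Plan an outline, write each section, then join the sections."""
    if not 1000 <= target_chars <= 10000:
        raise ValueError("target_chars must be between 1,000 and 10,000")

    # Step 1: ask the model for an outline, one heading per line.
    outline = generate(f"Plan an outline, one heading per line, for:\n{transcript}")
    headings = [line.strip() for line in outline.splitlines() if line.strip()]
    if not headings:
        raise ValueError("model returned an empty outline")

    # Step 2: write every section with an equal share of the character budget.
    budget = target_chars // len(headings)
    sections = []
    for heading in headings:
        body = generate(
            f"Write about {budget} characters for the section '{heading}' "
            f"using only statements from:\n{transcript}"
        )
        sections.append(f"{heading}\n\n{body}")

    # Step 3: assemble the sections into one text.
    return "\n\n".join(sections)
```

Splitting the work this way keeps each model call short and focused, at the cost of several calls per article, which fits the observation that this is the most compute-intensive mode.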

What the open-source model can do – and where its limits lie

We deliberately do not name the model we use. The reason: the AI landscape is evolving so quickly that we may well be running a different model six months from now. What matters is the principle: open source, run locally, no external service, EU processing. We are committed to these properties, even if we eventually swap the model for a more modern one.

What you should know nonetheless: local open-source models of the size that runs on a single GPU are not at the same absolute quality level as the largest cloud models (GPT-4 class, Claude Opus). For the typical analysis tasks – summarizing, structuring, rephrasing – they are, however, very well suited. The quality gap has also narrowed considerably over the past 18 months, while the data-protection advantage of a local model remains constant.

In practice this means: if you want a freely worded, creative text that sounds as if it were written by an experienced author, you should regard the AI output as a starting point, not as a finished text. For structured tasks like minutes, task lists or summaries, the result is usually ready to use straight away.

What the result stores – and what it does not

The result of an AI analysis is stored encrypted, just like the underlying transcript. No one outside your account can read it – neither our staff nor anyone who might gain physical access to the database. The prompts we use internally to steer the model are also not logged together with your content.

What we do store: that a job of a particular type (summary, minutes, …) ran, when it ran and how long the processing took. We need this metadata for billing, error diagnostics and capacity planning. The content of your transcripts or the AI results is not included in these logs.
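A metadata record of this kind can be pictured as a data type that has no field for content at all. The Python sketch below is an assumption about what such a record might look like; the class and field names are hypothetical, and only the three logged facts (job type, start time, duration) come from the paragraph above.

```python
from dataclasses import asdict, dataclass
from datetime import datetime


@dataclass(frozen=True)
class JobLogEntry:
    """Hypothetical log record: job type and timing only, no content fields."""
    job_type: str            # e.g. "summary" or "minutes"
    started_at: datetime
    duration_seconds: float


def log_job(job_type: str, started_at: datetime, finished_at: datetime) -> dict:
    """Build the metadata record for one finished job."""
    duration = (finished_at - started_at).total_seconds()
    return asdict(JobLogEntry(job_type, started_at, duration))
```

Since the record type has no slot for a transcript or a result, content cannot end up in the log by accident; it would take a deliberate schema change.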

Why AI analysis is only available in the Ultra plan

AI analysis runs on the same GPUs as the transcription. A single article job can occupy a GPU for several minutes; minutes or a summary take considerably less time. This is more expensive than an API call to a cloud provider, but in return the hardware stays under our own control rather than that of a third party.

That is why the feature is included in the Ultra plan. Anyone who wants to use it actively gets the hardware capacity required for it – at a predictable monthly price, rather than being billed by tokens. You can find all the details about the plans on the pricing page.

Summary

Local open-source model: Runs on scryp’s own GPU workers in the EU. No external API call, no US LLM provider.
Isolated environment: The model has no internet access. It cannot send data to the outside world, because there is technically no connection over which it could do so.
Same security model as the transcription: Results are stored encrypted, and processing runs in the same isolated environment.
No training on user data: The model is static. Your content does not flow back into the model.
Five types of analysis: Summary, minutes, task list, social media post (LinkedIn/Facebook/X) and article – each optimized for a concrete use case.
Ultra plan: AI analysis is part of the Ultra plan, because the GPU time it requires ties up real hardware capacity.

Conclusion

An AI feature that sends your carefully encrypted transcript off to an external cloud service afterwards makes little sense from a data-protection standpoint. That is why scryp does not do it. The summary, the minutes, the task list, the social media post and the article are created on the same worker system that also performs the transcription – in the EU, on our own GPU infrastructure, with a local open-source model that has no window to the outside world. It is slower and more expensive than an API call to OpenAI – but it is the only option that fits the rest of the architecture.