AI-Enabled Biologics Data Analysis For All: Interview with baSeq

On-Demand | July 9, 2025 | 09:00 UK Time | Event lasts 18m

Mikkel Reincke Fuglsang

CEO & Co-founder

baSeq

Format: 18 Minute Interview

0:14 

Hello and welcome to this interview for Oxford Global. 

 
0:17 
Today I have the pleasure of being joined by Mikkel Reincke Fuglsang, CEO and Co-Founder of baSeq. 

 
0:27 
baSeq will be joining us at our Cell UK 2025 conference in our startup zone, which will be taking place in November. 

 
0:35 
Mikkel, thank you very much for joining me today. 

 
0:39 
Thank you for having me. 

 
0:41 
It's a pleasure. 

 
0:43 
So first I want to ask you a bit about your company vision. 

 
0:47 
Can you start by telling us a bit about the journey behind baSeq? 

 
0:51 
What motivated you and your co-founders to launch this company? 

 
0:55 
Sure. 

 
0:57 
Well, both of my co-founders and I are pretty young and fairly recently graduated from university. 

 
1:07 
And it basically all started while we were studying. 

 
1:10 
We kept hearing that, OK, there's so much data, there's not that many bioinformaticians who can actually analyse this data. 

 
1:16 
And originally we actually thought, OK, let's do some consultancy work where we'll just contact companies and research groups and do some data analysis for them. 

 
1:26 
But then, at the same time, LLMs and ChatGPT 3 started emerging and getting into some of the more popular... 

 
1:38 
Or maybe I'll say that ChatGPT was getting more popular, both in science and also at the university and stuff like that. 

 
1:48 
So we thought, OK, can we maybe use this technology, large language models, to actually automate the process of doing sequencing data analysis instead? 

 
2:00 
Then we kind of got started working on that and have been working on that main goal ever since. 

 
2:08 
Fantastic. And your mission is to make complex biological data analysis accessible to any cell researcher. What challenges in the current single-cell RNA-seq landscape are you trying to solve? 

 
2:24 
Yeah, it really still traces back to the bottlenecks in data analysis, because demand for analysis is just vastly outpacing what bioinformaticians are currently able to analyse. 

 
2:43 
And this is just going faster and faster all the time as biomedical sciences are moving more into a data-driven workflow. 

 
2:51 
Of course, it's always been data-driven, but these huge data sets from single-cell sequencing and stuff like that are just taking up a lot of time to properly analyse. 

 
3:02 
So that's the main problem. 

 
3:04 
Then there's a lot of stuff that feeds off of this, which is, among other things, broadening the understanding of what happens in bioinformatics workflows, and everything related to that as well. 

 
3:19 
So yeah, a couple things, but mostly it's looking into how can we speed up and maybe resolve some of the bottlenecks in data analysis. 

 
3:30 
Great. I just wanted to ask briefly about your actual product itself. You've developed an end-to-end analysis pipeline specifically for single-cell RNA-seq. Could you walk us through some of the key features of that platform and how it differs from traditional tools that are out there? 

 
3:50 
So, to put it very briefly, it's a software platform that you log into and upload your data to, and then you can move through a pipeline that integrates with large language models. 

 
4:07 
And I guess comparing it to traditional tools is kind of... 

 
4:13 
When I think of traditional tools, I think of either bioinformaticians or learning to program yourself. 

 
4:21 
And then there's some of the legacy platforms that are out there and there's pros and cons to both of these. 

 
4:26 
With bioinformaticians, you have a high degree of freedom because the bioinformaticians really know what they're doing. 

 
4:32 
They're very professional people, but it's also very expensive and very slow. 

 
4:39 
And this is because it takes a long time to write back and forth between you and your lab and the bioinformaticians, wherever they're sitting, even if it's local. 

 
4:50 
And then compared to the legacy platforms, it's faster and cheaper, but it's also very locked. 

 
4:56 
And the analysis that you can do often ends up being a little shallow compared to the amount of insight that you could actually get from working with single-cell data. 

 
5:08 
So the ideal solution in our opinion, and what we're working towards, is creating a platform that can take the pros of both of these options. 

 
5:20 
So you have both a very accessible way of working with your data and the freedom and depth that comes with working with a bioinformatician. 

 
5:32 
And to do this, we're using large language models to turn natural language into intellectual biomedical research, outputs and understanding. 

 
5:47 
And we're very fortunate that we kind of got started when LLMs were getting into popular culture. 

 
5:56 
So we've been able to have this as an integral part of our software right from the beginning instead of having built something and then trying to put a large language model into it. 

 
6:07 
Yeah, I see. And so just going into your use of large language models, how are they embedded into your analysis workflow and what does that unlock for your users? 

 
6:23 
We're trying to integrate this into basically every aspect of our software. 

 
6:31 
Of course, there's traditional things like we've chosen to go with a workflow that's set up where you still have steps, but then you have the ability to integrate with a large language model for every one of these steps instead of just having the open chat box. 

 
6:50 
That's a design choice that we made pretty early on, because our idea was that with an open chat box, if you don't actually know that much about bioinformatics, you don't really know what to ask it. 

 
7:05 
So we thought it's better to structure this and then provide the large language model for the user to translate all of their inputs to code and to output. 

 
7:18 
So it's of course integral for the actual analysis that you're doing. 

 
7:25 
We also have a chat function where you can jump back and forth and ask questions about your output, how to understand what you're looking at and what the next steps could be. 

 
7:39 
How could I improve this analysis step? 

 
7:42 
There are also things like our exports, which automatically generate a lot of information about your plots and so on that's ready to be put into scientific publications. 

 
7:53 
And yeah, so we're trying to use this to make it a more intuitive process to work with this type of data without compromising on depth and efficiency. 

 
8:06 
Fantastic. And could you give us an example of how a researcher might use those AI features you just talked about to customise or interpret their analysis? 

 
8:15 
Sure. 

 
8:15 
I'll try not to go too step by step. 

 
8:19 
I think that could take a long time. 

 
8:21 
But I guess in a normal workflow, you'd contact a bioinformatician and you'd get them to do some initial quality control, maybe some clustering and some cell type annotation. 

 
8:33 
And then after that, you'd initiate a long exploratory phase where you'd go back and forth, both because you might have inputs to what the bioinformatician has produced, but also because you as a researcher most likely need to figure out what you actually want to do with this data. 

 
8:51 
This is partly an aspect of maybe not having done bioinformatics analysis yourself, so you don't actually know what's hidden within your data. 

 
9:05 
And another thing is just figuring out, OK, so how does this fit into my experimental hypothesis? 

 
9:12 
And then, comparatively, in our software... 

 
9:15 
We've tried to kind of speed up this whole process of the back and forth with the bioinformatician. 

 
9:21 
Right now, our idea is not to completely replace bioinformaticians, but it can kind of serve as a tool to get an understanding of your data yourself and then contact the bioinformatician and have a much more fruitful collaboration with them because you have an understanding of what you actually want. 

 
9:40 
And if you don't have access to a bioinformatician, then you can always jump into our software and do a lot of analysis and then get some depth into your data. 

 
9:48 
And so let's say you've done your quality control and your clustering within our software platform. 

 
9:58 
And then you want to move into some of the analysis steps that have a bit more depth to them. 

 
10:08 
And so you could jump into the chat, for example, and say, OK, based on my clustering plots, which resolution should I work with? 

 
10:17 
And are there any clusters here that might seem more interesting? 

 
10:22 
And then it's going to look at your plot and analyse it and then pull out some information for you. 

 
10:27 
And it might identify a couple of clusters that seem more interesting between your conditions and samples. 

 
10:36 
Then you can run a differential gene expression analysis on those clusters. 

 
10:39 
And then again go back to the chat and say, OK, based on all of the differentially expressed genes, are there any that seem more interesting? 

 
10:48 
Or just give me a list of the most highly differentially expressed genes. 

 
10:54 
And then you can take that list and put it into the step that visualises gene activity. 

 
11:02 
And you immediately also get some visualisations of your gene activity in order to annotate your clusters or look into subtypes and different things within your cell types and data. 

 
11:18 
And this back-and-forth workflow all happens within, well, depending on how well you're acquainted with the software and how effectively you work... 

 
11:27 
All of this happens within five minutes, whereas usually you draft an e-mail and then you wait a day or a couple of days for the bioinformatician to run the analysis and get back to you. 

 
11:38 
And then it doesn't look quite the way you wanted it to. 

 
11:41 
And then you have to write another e-mail or have another online meeting. 

 
11:44 
And so this is just a way to kind of speed up all of these processes. 

 
11:49 
And then of course, next to all of the plots that you're getting, you're also getting automatically generated explanations of the plots, and you get the code that's been run to produce the plots and explanations of the code itself. 

 
12:07 
And yeah, and a lot of interaction with the different elements of AI within the software, if that makes sense. 

 
12:15 
Yeah, it does. Fantastic. And also, when we think about AI, one thing that sometimes comes up is it being a black box and transparency and interpretability being part of that. How do you see these concepts of transparency in your long term product vision? 

 
12:39 
Yeah, it's something that I think everyone should find important, and we also do. 

 
12:49 
And there are two aspects to this. 

 
12:52 
I think especially within biomedical science, putting data into a black box that then produces some output that you hopefully can trust is not a great feeling, especially not when you're working with patient sensitive data and stuff like that. 

 
13:08 
So I think being transparent about how data is handled and also providing as much input as possible is a very good thing, especially, like I said, in biomedical science. 

 
13:22 
And on top of this, we're also working with new technology that people are still learning to trust. 

 
13:29 
And I think even if, or "when" is probably also a way to say it, the model hallucinates, then you're much more likely to figure it out and understand why it happened if you have as much information as possible about the process that generated the output. 

 
13:47 
And of course this is a balancing act, because you also don't want to provide so much output and information that it becomes confusing to work with the actual software application. 

 
14:03 
But yeah, I think you need enough information so that people feel like they can trust it and it gets somewhat demystified. 

 
14:12 
Yeah. Thank you. So, what's next for baSeq? And are you planning to expand into other omics types or integrate with lab automation or cloud platforms, anything like that? 

 
14:29 
Yeah, we are. 

 
14:31 
And actually, just a couple of days ago we announced that we're also including a branch of our software that looks into bulk RNA sequencing. 

 
14:43 
But another thing that we're looking a lot at right now is spatial transcriptomics. 

 
14:49 
There's really a lot happening within that field. 

 
14:50 
And I feel like the extra spatial dimension that comes with that sequencing technology is very useful for research. 

 
15:00 
And we're also looking a bit forward at things like single cell proteomics and stuff like that. 

 
15:06 
I think that field still has a way to go, but it'll be nice to be in on it early so that people have a way to analyse it, and hopefully to reduce the barrier to entry for the technology. 

 
15:23 
That's great, thank you. And just finally, where do you see AI and language models playing the biggest role for bioinformatics, say five years from now? And how is baSeq positioning itself in that future? 

 
15:40 
I think large language models are going to integrate into probably every technical industry imaginable. If they haven't already, then it's just a matter of time. 

 
15:54 
But I also think that it's going to be some time before software can do what humans can do as well as humans can do it. 

 
16:04 
But I definitely feel like in five years, AI is going to be an essential tool for bioinformatics workflows. 

 
16:13 
I think today, going back to black boxes, bioinformatics itself is almost a big black box, where you generate some data that you get sequenced and then you send it off to a bioinformatician and they do something with it. 

 
16:30 
And then you have some visualisations that you can throw into a research paper or something like that. 

 
16:34 
And I think both for the bioinformaticians and for the researchers, it would be helpful to kind of broaden the understanding of how all of this works. 

 
16:44 
So I think AI is going to be a key feature in broadening the understanding of, and demystifying, the bioinformatics aspect of biomedical research. 

 
16:58 
But at the same time, I mean, who knows how much is going to happen? 

 
17:05 
Already our own software is able to do stuff that everyone at baSeq thought was much further away. 

 
17:13 
So it could explode and go completely wild. 

 
17:16 
I don't think AGI is achievable in five years, but I definitely think that it's going to be an integral tool in bioinformatics. 

 
17:27 
That's great. Thank you. And thank you so much for joining me, Mikkel. 

 
17:32 
You can see Mikkel at our Cell UK 2025 conference, which is coming up in November. 

 
17:40 
He'll be at the startup zone. 

 
17:42 
So if you want to learn more about baSeq, you can come and catch the company there. 

 
17:48 
Thank you once again for joining me. 

 
17:51 
Yeah, thank you so much, Tom. 

 
17:52 
And I'll also just say that we will set up a booth where you can try out the software if you want to test it out for yourself. 

 
17:59 
And I look forward to the conference. 

 
18:02 
It's going to be a blast. 

 
18:03 
Looking forward to it too. 

 
18:04 
Thank you very much. 

 
18:06 
Thank you.