Bioinformatics Analysis

Immunomics
2 topics, 3 posts

https://www.sc-best-practices.org/air_repertoire/ir_profiling.html

V(D)J recombination process

Cell and Gene Therapy
3 topics, 4 posts

Definition of Gene Therapy in the EU
In the EU, the definition of a Gene Therapy Medicinal Product (GTMP) is provided
in Directive 2009/120/EC amending Directive 2001/83/EC, part IV of Annex I.
A GTMP means a biological medicinal product that has the following
characteristics:
(a) it contains an active substance that contains or consists of a recombinant
nucleic acid used in or administered to human beings with a view to
regulating, repairing, replacing, adding or deleting a genetic sequence;
(b) its therapeutic, prophylactic or diagnostic effect relates directly to the
recombinant nucleic acid sequence it contains, or to the product of the
genetic expression of this sequence.
GTMPs do not include vaccines against infectious diseases.
Source: Hazel Aranha and Humberto Vega-Mercado, Handbook of Cell and Gene Therapy: From Proof-of-Concept through Manufacturing to Commercialization, CRC Press (2023).

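Metabolomics
5 topics, 9 posts

Applications of NMR in metabolomics

1. Principles of nuclear magnetic resonance (NMR)
https://www.youtube.com/watch?v=pUWcXvw1Rsg
https://www.bilibili.com/video/BV1CU4y1E7xL/?spm_id_from=333.788.recommend_more_video.-1 (this one has a Chinese translation, which makes it easier to follow)
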
Forensics
3 topics, 8 posts

https://bitbucket.org/rirgabiss/mhinngs/src/master/
MHinNGS is a tool for analysing microhaplotypes (MHs) in single-end sequencing data from a massively parallel sequencing (MPS) platform. The tool identifies the reads containing the MHs and calls the MH genotypes according to the criteria and positions specified in the configuration file. It also searches for additional variants in the region defined by the start and stop positions specified in the configuration file.
The tool takes a reference sequence and raw single-end FASTQ reads as input and runs the alignment itself. Its third-party dependencies are:
python 3.6 including pip
samtools version 1.9
hisat2 version 2.1.0
stringtie version 2.0.3
agrep T.R.E. version 0.8.0
Paired-end reads are not supported.

Epigenetics
1 topic, 4 posts

https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3828144/
Practical Guidelines for the Comprehensive Analysis of ChIP-seq Data

Synthetic Biology
4 topics, 39 posts

https://github.com/mesoscope

Plant Genomics
13 topics, 44 posts

Reference

https://gatk.broadinstitute.org/hc/en-us/articles/360035531572-Evaluating-the-quality-of-a-germline-short-variant-callset

Proteomics
10 topics, 25 posts

https://www.sinobiological.com/resource/protein-review/fc-fusion-proteins

Clinical Bioinformatics
10 topics, 28 posts

Tumor NGS Data Analysis
6 topics, 7 posts

https://civicdb.org/pages/about

RT-PCR vs NGS
• anneng
0 upvotes, 1 post, 6 views
No replies yet

global biotic interactions: a species interaction database
• anneng
0 upvotes, 4 posts, 27 views

Export data:
sqlite> .headers on
sqlite> .mode csv
sqlite> .output data.csv
sqlite> SELECT customerid,
...> firstname,
...> lastname,
...> company
...> FROM customers;
sqlite> .quit
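
The same export can be scripted. The following is a minimal sketch using Python's standard `sqlite3` and `csv` modules; the `customers` table and its single row are made-up stand-ins for the real database, and `export_query_to_csv` is a hypothetical helper name.

```python
import csv
import io
import sqlite3


def export_query_to_csv(conn, query, handle):
    """Run `query` and write a header line plus all result rows as CSV."""
    cursor = conn.execute(query)
    writer = csv.writer(handle, lineterminator="\n")
    # cursor.description holds one tuple per column; item 0 is the column name.
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)


# In-memory stand-in for the real database; the row below is invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers "
    "(customerid INTEGER, firstname TEXT, lastname TEXT, company TEXT)"
)
conn.execute("INSERT INTO customers VALUES (1, 'Ada', 'Lovelace', 'Example Ltd')")

buffer = io.StringIO()
export_query_to_csv(
    conn,
    "SELECT customerid, firstname, lastname, company FROM customers",
    buffer,
)
print(buffer.getvalue(), end="")
```

To write a real file instead of a string buffer, pass `open("data.csv", "w", newline="")` as the handle.
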
cellxgene reference documentation
• anneng
0 upvotes, 2 posts, 9 views

https://www.medrxiv.org/content/10.1101/2020.11.20.20227355v1.full
Reference paper

Summary of large cohort study projects
• anneng
0 upvotes, 7 posts, 12 views

https://www.openhumans.org/

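Annotating variants with VEP
• anneng
0 upvotes, 2 posts, 3 views

VEP rejects unsorted input with this error:
The input file appears to be unsorted. Please sort by chromosome and by location and re-submit

Either sort the input first, or pass this flag to skip the order check:
--no_check_variants_order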
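
As an alternative to skipping the check, the input can be sorted before submission. The sketch below sorts VCF records by chromosome and position in plain Python. It assumes that records only need to be grouped by chromosome and ascending by position within each chromosome; chromosome names are compared as plain strings, and `sort_vcf_lines` is a hypothetical helper, not part of VEP.

```python
def sort_vcf_lines(lines):
    """Return header lines unchanged, followed by records sorted by (CHROM, POS)."""
    header = [ln for ln in lines if ln.startswith("#")]
    records = [ln for ln in lines if ln.strip() and not ln.startswith("#")]

    def sort_key(line):
        # The first two tab-separated VCF columns are CHROM and POS.
        chrom, pos = line.split("\t", 2)[:2]
        return (chrom, int(pos))

    return header + sorted(records, key=sort_key)


# Made-up records for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "2\t500\t.\tA\tG\t.\t.\t.",
    "1\t900\t.\tC\tT\t.\t.\t.",
    "1\t100\t.\tG\tA\t.\t.\t.",
]
sorted_lines = sort_vcf_lines(example)
for line in sorted_lines:
    print(line)
```

For large files a dedicated tool is the better choice, since this sketch holds every record in memory.
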
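LUMPY structural variant analysis pipeline
• anneng
0 upvotes, 10 posts, 47 views

svtools vcfpaste \
    -m merged.vcf \
    -f cn.list \
    -q \
    | bgzip -c \
    > merged.sv.gt.cn.vcf.gz

This step can fail when many per-sample VCFs are opened at once:
IOError: [Errno 24] Too many open files: '../cn/ZGSP-9.vcf'
Fix: raise the open-file limit before running the command:
ulimit -n 2048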
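
When the pipeline is launched from a Python wrapper rather than an interactive shell, the same limit can be raised from inside the process. This is a minimal sketch using the standard `resource` module (Unix only); `raise_open_file_limit` is a hypothetical helper, and the new limit applies only to the current process and the child processes it starts.

```python
import resource


def raise_open_file_limit(target):
    """Raise the soft RLIMIT_NOFILE toward `target`, never above the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        target = min(target, hard)
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft  # already high enough, leave it alone
    resource.setrlimit(resource.RLIMIT_NOFILE, (target, hard))
    return target


# 2048 mirrors the value used with ulimit above.
new_soft = raise_open_file_limit(2048)
print("soft open-file limit:", new_soft)
```

A command started afterwards with `subprocess.run(...)` inherits the raised limit.
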
10x single-cell concepts
• anneng
0 upvotes, 4 posts, 17 views

https://www.jieandze1314.com/post/cnposts/scrna-6/
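How it works, step 1: design of the DNA primers on the gel beads
Gel beads are prepared in advance, and each bead is "seeded" with a specific DNA oligo. Each oligo sequence consists of several segments:

Segment 1 is the barcode: 16 bp, with about 3.5 million distinct barcodes. Each bead carries one barcode, so the barcode distinguishes the beads from one another. => the ID of each gel bead

Any two barcodes differ by at least two bases. Sequencing can misread a base, and this spacing keeps two barcodes from being confused (if two barcodes differed at only one of their 16 positions, a single misread at that position would turn one into the other).

Segment 2 is the UMI (unique molecular identifier), a random sequence, so each DNA oligo has its own UMI. The UMI is 10 bp long, giving 4^10 = 1,048,576 possible sequences, a little over one million. After PCR and deep sequencing, it is used to map reads back to the original cDNA molecule. => the ID of each tagged DNA molecule

The UMI addresses the following situation: PCR amplification turns one transcript fragment into many reads, and without a label we cannot tell that they came from the same molecule. Amplification efficiency can also differ between genes, so one gene may end up with more reads than another purely because it amplified better, even when the two genes are expressed at similar levels. Counting UMIs instead of reads removes this "PCR bias".

Segment 3 is the poly(dT) sequence. It binds the poly(A) tail of mRNA and serves as the primer for reverse transcription, from which the cDNA is synthesized.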
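
The barcode spacing rule in step 1 (any two barcodes differ at two or more positions) can be checked with a Hamming distance. This is a minimal sketch; the three 16 bp barcodes are made up and do not come from any real whitelist.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("barcodes must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_distance(barcodes):
    """Smallest Hamming distance over all pairs of barcodes."""
    return min(hamming(a, b) for a, b in combinations(barcodes, 2))


# Invented barcodes for illustration only.
whitelist = [
    "AAACCCAAGAAACACT",
    "AAACCCAAGAAACCAT",
    "AAACCCAAGAAACTGC",
]
print(min_pairwise_distance(whitelist))
```

A minimum distance of 2 means one misread base can never turn a valid barcode into another valid barcode, although the erroneous read may still be equally close to two barcodes. Comparing all pairs is quadratic, so this form only suits small lists.
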

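How it works, step 2: the microfluidic channels of the chip
At the first cross junction, the cell suspension mixes with the gel beads. At the second junction, oil is added and wraps the bead-plus-cell mixture into water-in-oil droplets. Together these droplets form an emulsion.

In the emulsion, some droplets contain one cell (the red circles in the original figure), some contain no cell, and some contain two or more cells (these are called "doublets"). The number of cells per droplet follows a Poisson distribution.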
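
The Poisson statement can be made concrete with a short calculation. The sketch below assumes cells enter droplets independently at a mean rate `lam` cells per droplet; the value 0.1 is an illustrative choice, not a 10x specification.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k cells in a droplet at mean rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def occupancy(lam):
    """Probabilities that a droplet holds zero, one, or two-or-more cells."""
    p0 = poisson_pmf(0, lam)
    p1 = poisson_pmf(1, lam)
    return {"empty": p0, "single": p1, "multiple": 1.0 - p0 - p1}


# lam = 0.1 is an assumed loading rate chosen for illustration.
probs = occupancy(0.1)
# Share of cell-containing droplets that hold more than one cell.
multiplet_share = probs["multiple"] / (probs["single"] + probs["multiple"])
print(probs, multiplet_share)
```

Lowering `lam` reduces the share of multi-cell droplets, but it also increases the share of empty droplets, which is the trade-off behind limiting the number of loaded cells.
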

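Most cells end up in a droplet of their own (about 65% of the cells in the suspension are successfully packaged into a droplet that contains a bead, so the cell capture efficiency is about 65%). The reads used in downstream analysis come from these cells.

How it works, step 3: sequencing library construction
Once the emulsion is formed, the cells are lysed so that their mRNA is released. =>

The released mRNA mixes with the aqueous phase of the droplet, which contains the bead-bound oligo primers, reverse transcriptase, and dNTPs, and reverse transcription takes place. =>

The poly(A) tail of the mRNA pairs with the poly(dT) on the gel bead, so the mRNA binds the tagged DNA oligo on the bead, and reverse transcriptase then synthesizes the cDNA. =>

The resulting cDNA carries the bead-specific barcode, and each cDNA molecule carries its own UMI. With these two tags, any given cDNA can be told apart from all others. =>

All of the aqueous phase is then recovered from the emulsion, which recovers the tagged cDNA molecules. =>

Adapters are added to the cDNA and it is amplified by PCR to give an Illumina library.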

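Data composition
A sample typically yields a few hundred to a few thousand cells, while there are more than 3 million barcodes, so it is rare for one barcode to correspond to two cells. The data can therefore be demultiplexed by barcode, tracing each read back to its cell.

It is still possible for one barcode to correspond to two or more cells. Demultiplexing by barcode then merges the reads of those cells into a single "pool". To reduce the number of pools, the number of input cells has to be controlled when the cell suspension is prepared.

So more input cells is not always better. The fewer cells loaded, the fewer mixed pools result, which also follows from the Poisson distribution. As a rule of thumb, a sample suspension should contain fewer than 10,000 cells.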
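
How rare shared barcodes are can be estimated with a birthday-problem calculation. The sketch below assumes each cell receives a barcode drawn uniformly and independently from the pool, which is a simplification; 3,500,000 is the approximate barcode count quoted in the post, and both function names are hypothetical.

```python
import math


def prob_any_shared_barcode(n_cells, n_barcodes):
    """Probability that at least two of n_cells draw the same barcode."""
    # Sum logs of the no-collision factors (1 - i/N) to stay numerically stable.
    log_no_collision = sum(math.log1p(-i / n_barcodes) for i in range(n_cells))
    return 1.0 - math.exp(log_no_collision)


def expected_cells_sharing(n_cells, n_barcodes):
    """Expected number of cells whose barcode is also carried by another cell."""
    p_shared = 1.0 - (1.0 - 1.0 / n_barcodes) ** (n_cells - 1)
    return n_cells * p_shared


N_BARCODES = 3_500_000  # approximate figure from the post
for n_cells in (1_000, 10_000):
    print(
        n_cells,
        prob_any_shared_barcode(n_cells, N_BARCODES),
        expected_cells_sharing(n_cells, N_BARCODES),
    )
```

Under these assumptions, about 0.3% of 10,000 cells would share a barcode with another cell, and the expected number grows roughly with the square of the cell count, which is consistent with the advice to limit input cells.
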

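After collapsing reads by UMI, the relationship between reads per cell and genes detected becomes visible. In a plot with reads per cell on the x-axis and genes detected on the y-axis, more reads yield more genes. Typically, once a cell reaches about 300,000 reads, the number of genes grows more slowly with additional reads. => the gene-count "plateau"

A cell typically yields 40,000 to 80,000 valid UMIs, or about 10 UMIs per gene per cell on average.

The expression level of one gene in a cell is one dimension describing that cell, so the expression levels of several thousand measured genes form several thousand dimensions. When thousands of cells are analysed together, dimensionality reduction and clustering, followed by projection into a three-dimensional space with colour coding, produce the familiar cluster plot.
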
The role of UMIs
• anneng
0 upvotes, 2 posts, 12 views

10x data carries several kinds of tags, which need to be told apart:

Question: How does Cell Ranger correct for amplification bias?
Answer: Each transcript captured in the Single Cell 3' and V(D)J assay is labeled with a 10-12 bp Unique Molecular Identifier (UMI) in addition to a 16 bp cell barcode. After sequencing, the UMI is used to distinguish sequenced reads that originate from unique mRNA molecules vs PCR duplicates.
Multiple reads that match the same UMI, 10x barcode and gene are collapsed to a single UMI count in the gene-barcode UMI count matrix.

For more information please see: Gene-Barcode Matrices.

https://www.biorxiv.org/content/10.1101/065912v1.full
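
The collapsing rule quoted in the answer (reads with the same UMI, barcode, and gene count once) can be sketched in a few lines. This is an illustration of the counting idea only, not Cell Ranger's implementation: it does no sequencing-error correction of barcodes or UMIs, and the reads below are invented.

```python
from collections import defaultdict


def umi_count_matrix(reads):
    """Collapse (barcode, gene, umi) reads into UMI counts per (barcode, gene)."""
    umis = defaultdict(set)
    for barcode, gene, umi in reads:
        # Reads with an identical barcode, gene, and UMI are PCR duplicates.
        umis[(barcode, gene)].add(umi)
    return {key: len(value) for key, value in umis.items()}


# Invented reads: (cell barcode, gene, UMI).
reads = [
    ("AAACCCAAGAAACACT", "GAPDH", "ACGTACGTAC"),
    ("AAACCCAAGAAACACT", "GAPDH", "ACGTACGTAC"),  # PCR duplicate
    ("AAACCCAAGAAACACT", "GAPDH", "TTGCATGCAA"),
    ("AAACCCAAGAAACACT", "ACTB", "ACGTACGTAC"),
    ("AAACCCAAGAAACCAT", "GAPDH", "ACGTACGTAC"),
]
counts = umi_count_matrix(reads)
print(counts)
```

Five reads collapse to four UMI counts here, because the duplicated GAPDH read in the first cell is counted once.
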
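Survey of software used by single-cell cloud platforms
• anneng
0 upvotes, 1 post, 16 views
No replies yet

Bioinformatics analysis task checklist
• anneng
0 upvotes, 1 post, 4 views
No replies yet

Reference-guided assembly of microbial NGS data
• anneng
0 upvotes, 1 post, 10 views
No replies yet
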
SNP analysis in polyploid plants
• anneng
0 upvotes, 4 posts, 13 views

https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-020-03704-1#availability-of-data-and-materials
Evaluation of variant calling tools for large plant genome re-sequencing
The pipeline this paper recommends is BWA plus samtools/mpileup.

GATK support for polyploids: the ploidy parameter
• anneng
0 upvotes, 2 posts, 8 views

https://bmcbioinformatics.biomedcentral.com/articles/10.1186/s12859-020-03704-1
Evaluation of variant calling tools for large plant genome re-sequencing

The VCF format
• anneng
0 upvotes, 2 posts, 5 views

https://www.biostars.org/p/240965/
./. means there is no data for the sample at that site (a no-call).
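
Detecting a no-call in a genotype (GT) field takes only a few lines. A minimal sketch, assuming GT strings use `/` or `|` as the allele separator and `.` for a missing allele; `parse_gt` and `is_no_call` are hypothetical helper names.

```python
import re


def parse_gt(gt):
    """Split a GT string into allele indices; a missing allele '.' becomes None."""
    return [None if allele == "." else int(allele) for allele in re.split(r"[/|]", gt)]


def is_no_call(gt):
    """True when every allele in the genotype is missing, as in './.'."""
    return all(allele is None for allele in parse_gt(gt))


for gt in ("./.", "0/1", "1|0", "./1"):
    print(gt, parse_gt(gt), is_no_call(gt))
```

A partially missing genotype such as `./1` is not treated as a full no-call here; whether to filter it is a separate decision.
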
Docker applications in bioinformatics
• anneng
0 upvotes, 1 post, 5 views
No replies yet

Single-cell trajectory analysis
• anneng
0 upvotes, 3 posts, 7 views

Scanpy uses PAGA for this. The paper is:
https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1663-x
Like Scanpy, PAGA comes from the theislab group:
https://github.com/theislab/paga

Recommended minimum sequencing depth
• anneng
0 upvotes, 1 post, 4 views
No replies yet

Variant warehousing
• anneng
0 upvotes, 1 post, 4 views
No replies yet

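Feature analysis of Loupe Browser
• anneng
0 upvotes, 6 posts, 12 views

https://support.10xgenomics.com/single-cell-gene-expression/software/visualization/latest/tutorial-reclustering
Clustering can be rerun with different filter conditions and thresholds.
Each clustering result should be saved as a separate result, which can later be viewed or deleted.
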
TileDB applications in genomics
• anneng
0 upvotes, 1 post, 1 view
No replies yet
