<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[Accelerating Single-cell Bioinformatics with N-dimensional Arrays in the Cloud]]></title><description><![CDATA[<p dir="auto"><a href="/assets/uploads/files/1644810962799-accelerating-single-cell-bioinformatics-with-n-dimensional-arrays-in-the-cloud-ryan-williams.pptx">Accelerating Single-cell Bioinformatics with N-dimensional Arrays in the Cloud - Ryan Williams.pptx</a></p>
]]></description><link>http://an.forum.genostack.com/topic/547/accelerating-single-cell-bioinformatics-with-n-dimensional-arrays-in-the-cloud</link><generator>RSS for Node</generator><lastBuildDate>Sat, 13 Jun 2026 09:21:24 GMT</lastBuildDate><atom:link href="http://an.forum.genostack.com/topic/547.rss" rel="self" type="application/rss+xml"/><pubDate>Mon, 14 Feb 2022 03:56:40 GMT</pubDate><ttl>60</ttl><item><title><![CDATA[Reply to Accelerating Single-cell Bioinformatics with N-dimensional Arrays in the Cloud on Mon, 14 Feb 2022 03:57:29 GMT]]></title><description><![CDATA[<p dir="auto"><a href="https://github.com/lasersonlab/single-cell-experiments" rel="nofollow ugc">https://github.com/lasersonlab/single-cell-experiments</a></p>
<h2>Project overview</h2>
<p dir="auto">theis lab	# scanpy<br />
laserson lab	# single-cell-experiments (zappy,zarr,ndarray.scala)</p>
<ol>
<li>Supports reading single-cell data in csv, adata, zarr, and zarr_gcs formats (gcs, g3fs, i.e. data stored in Google or Amazon cloud storage).</li>
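<li>After reading, it relies on the zarr package to split the data into chunks (drawback: the data is read repeatedly, and every read is a full load).</li>
<li>For adata, the matrix (the value of the .X attribute) is split by a specified chunk size and mapped by index to separate chunk objects, i.e. a PairedRDD (at this point the value is zarr, possibly in a compressed format; see zarr_spark.py#read_zarr_chunk|get_chunk_indices).</li>
<li>Computation is then performed on the RDD (see anndata_spark.py#log1p).</li>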
</ol>
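<p dir="auto">Steps 2 to 4 can be sketched roughly as follows. This is a minimal illustration using only NumPy, not the repository's actual code: the helper names (chunk_row_ranges, to_keyed_chunks, log1p_chunks) are hypothetical, a plain Python list of (index, block) pairs stands in for the PairedRDD, and a small dense array stands in for the .X matrix.</p>

```python
import numpy as np


def chunk_row_ranges(n_rows, chunk_rows):
    """Return (chunk_index, start, stop) for each row block of a matrix."""
    n_chunks = -(-n_rows // chunk_rows)  # ceiling division
    return [
        (i, i * chunk_rows, min((i + 1) * chunk_rows, n_rows))
        for i in range(n_chunks)
    ]


def to_keyed_chunks(x, chunk_rows):
    """Split x by rows into (chunk_index, block) pairs (stand-in for a paired RDD)."""
    ranges = chunk_row_ranges(x.shape[0], chunk_rows)
    return [(i, x[start:stop]) for i, start, stop in ranges]


def log1p_chunks(keyed_chunks):
    """Apply log1p to each block independently, as a map over the values would."""
    return [(i, np.log1p(block)) for i, block in keyed_chunks]


# Toy stand-in for the .X matrix: 5 cells x 3 genes (assumed data).
x = np.arange(15, dtype=float).reshape(5, 3)
pairs = to_keyed_chunks(x, chunk_rows=2)

# Reassemble the blocks in chunk-index order to get the full result.
ordered = sorted(log1p_chunks(pairs), key=lambda kv: kv[0])
result = np.vstack([block for _, block in ordered])
```

<p dir="auto">Because log1p is element-wise, each block can be processed without seeing the others, which is what makes the chunk-per-record layout suitable for a distributed map.</p>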
<h2>Issues with this project:</h2>
<ol>
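<li>The project is currently unmaintained, and the source code does not specify dependency versions, so it cannot be run.</li>
<li>The analysis process cannot be displayed interactively; the workflow steps and control parameters must be defined in advance.</li>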
</ol>
]]></description><link>http://an.forum.genostack.com/post/1195</link><guid isPermaLink="true">http://an.forum.genostack.com/post/1195</guid><dc:creator><![CDATA[ice-melt]]></dc:creator><pubDate>Mon, 14 Feb 2022 03:57:29 GMT</pubDate></item></channel></rss>