Think simply, act earnestly


LASSO and Metagenomics

Posted on 2019-12-08 | Category: metagenomics
Word count: 754 | Reading time ≈ 3 min

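For a project I needed to look into LASSO (the Least Absolute Shrinkage and Selection Operator; in Chinese, 最小绝对值收敛和选择算子 or 套索算法). After tinkering with it for a few days I still don't fully understand it, but I have a rough idea of how to use it, so I'm writing that down here.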

  1. First, an introduction on Zhihu:
    https://zhuanlan.zhihu.com/p/42122611

  2. While reading up on it, I found that LASSO does not produce p-values, and that computing them is generally discouraged. I didn't fully follow the reasoning, but the conclusion is worth remembering. Reference:
    Are p values generally reported in LASSO regressions? : statistics

  3. If you do need p-values, they have to come from a separate regression:
    https://stats.stackexchange.com/questions/84185/lasso-to-identify-important-variables-in-ordered-logistic-regression
    One paper I read computed its p-values with ordinal regression:
    Chen, D. Q. et al. Identification of serum metabolites associating with chronic kidney disease progression and anti-fibrotic effect of 5-methoxytryptophan. Nat. Commun. 10, 1–15 (2019).

  4. How to apply LASSO to microbiome data:
    I referred to this script, which only handles two-group data and takes a BIOM-format file as input: https://github.com/alifar76/MicrobeNets

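For the same project, I adapted that script, as far as I understand it, to accept plain tab-delimited files and more than two groups. The group file must have its character labels converted to numeric values. The script is below; it may not be entirely correct and is for reference only.
BIOM-format input works similarly, except that the taxonomy annotation in the file's last column has to be handled.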

## For non-BIOM input files, genus level
library(glmnet)

intable <- "Genus_abundance.xls"   # abundance table (taxa in rows, samples in columns)
mapfile <- "Sample_Group.txt"      # group file; labels must be converted to numeric values
outfile <- "Genus_lasso_output.xls"
measuretype <- "deviance"          # see the GitHub script above for these settings
family <- "multinomial"            # same as above

# check.names = FALSE keeps sample IDs unchanged so they match the group file
otutab <- read.table(intable, header = TRUE, sep = "\t", check.names = FALSE, row.names = 1)
group <- read.table(mapfile, header = TRUE, sep = "\t", check.names = FALSE, row.names = 1, comment.char = "")
# keep only the samples listed in the group file, then put samples in rows
otutab <- t(as.matrix(otutab[, rownames(group)]))

# alpha = 1 means LASSO
fitted <- glmnet(otutab, group[, 1], family = family, standardize = FALSE, alpha = 1)
# plot(fitted, xvar = "lambda", label = TRUE)

# This call emits a warning; in my tests it seems to be related to nfolds.
# parallel = TRUE requires a registered foreach backend (e.g. doParallel).
cv.dat <- cv.glmnet(otutab, group[, 1], grouped = FALSE, nfolds = 10, alpha = 1,
                    parallel = FALSE, type.measure = measuretype, family = family)
# plot(cv.dat)
coefval <- coef(cv.dat)  # for multinomial: a list with one sparse matrix per class

### Output: non-zero coefficients for every class, intercept excluded
result <- do.call(rbind, lapply(names(coefval), function(cls) {
  m <- as.matrix(coefval[[cls]])
  keep <- m[, 1] != 0
  keep[1] <- FALSE  # the first row is the intercept
  data.frame(Class = rep(cls, sum(keep)), Tax = rownames(m)[keep],
             coefficient = m[keep, 1], row.names = NULL)
}))
write.table(result, file = outfile, sep = "\t", row.names = FALSE, quote = FALSE)
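
The group file mentioned above has to carry numeric codes instead of text labels. Here is a minimal Python sketch of that preprocessing step, assuming a two-column tab-delimited file (sample ID, group label) with a header row. The function name `encode_groups` and the example labels are made up for illustration and are not part of the original script.

```python
import csv
import io


def encode_groups(text):
    """Map each distinct group label to an integer code (1, 2, 3, ...).

    `text` is the content of a tab-delimited file with a header row and
    two columns: sample ID and group label (an assumed layout).
    Codes are assigned in order of first appearance.
    Returns (rows, mapping).
    """
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    next(reader)  # skip the header row
    mapping = {}
    rows = []
    for row in reader:
        if not row:  # ignore blank lines
            continue
        sample, label = row[0], row[1]
        code = mapping.setdefault(label, len(mapping) + 1)
        rows.append((sample, code))
    return rows, mapping


# Hypothetical group file content
example = "Sample\tGroup\nS1\tHealthy\nS2\tMild\nS3\tSevere\nS4\tMild\n"
rows, mapping = encode_groups(example)
for sample, code in rows:
    print(sample, code, sep="\t")
```

Keeping the returned `mapping` next to the output makes it possible to translate the numeric classes back into group names when reading the LASSO results.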
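  5. How well does it work?
    Judging from my own results, not very well. I also ran MaAsLin2 on the same data and its results looked better; the two methods shared only a few taxa. This is just my personal experience and opinion, for reference only.
    MaAsLin2

  6. How to run LASSO in Python?
    A paper that used this approach, together with its scripts:
    Wilmanski, T. et al. Blood metabolome predicts gut microbiome α-diversity in humans. Nat. Biotechnol. doi:10.1038/s41587-019-0233-9
    GitHub - PriceLab/ShannonMets: Blood metabolome predicts gut microbiome α-diversity in humans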
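
To give a feel for what LASSO does in Python, below is a minimal sketch of coordinate descent with soft-thresholding, written with the standard library only. It is not the code from the repository above. It assumes centred data, no intercept, and the objective (1/(2n))·||y − Xβ||² + λ·||β||₁; the function names and the toy data are made up for illustration. For real analyses a library implementation such as scikit-learn's `Lasso` is the usual choice.

```python
def soft_threshold(value, lam):
    """Shrink value towards zero by lam; values within [-lam, lam] become 0."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def lasso_coordinate_descent(X, y, lam, n_iter=200):
    """Minimise (1/(2n)) * ||y - X b||^2 + lam * ||b||_1 by cyclic coordinate descent.

    X is a list of rows, y a list of responses. No intercept is fitted,
    so the data are assumed to be centred beforehand.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # y - X b, with b = 0 at the start
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0:
                continue  # an all-zero column carries no information
            # correlation of column j with the partial residual (feature j added back)
            rho = sum(col[i] * (residual[i] + col[i] * beta[j]) for i in range(n)) / n
            new_beta = soft_threshold(rho, lam) / z
            delta = new_beta - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= col[i] * delta
                beta[j] = new_beta
    return beta


# Toy data with two orthogonal features: a strong one and a weak one
X = [[1, 0], [-1, 0], [0, 1], [0, -1]]
y = [2.0, -2.0, 0.1, -0.1]
beta = lasso_coordinate_descent(X, y, lam=0.1)
print(beta)
```

On this toy data the least-squares coefficients would be 2 and 0.1; with λ = 0.1 the first is shrunk to about 1.8 and the second is set exactly to zero, which is the variable-selection behaviour that makes LASSO attractive for high-dimensional abundance tables.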

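Logging in and transferring files on Mac without typing a password or verification code under two-step verification

Posted on 2019-10-25 | Category: Mac
Word count: 444 | Reading time ≈ 2 min

Motivation

A while ago our company cluster started requiring two-step verification at login. Every login and every file transfer meant picking up my phone, reading the verification code, and typing it in. Today I finally couldn't stand such a tedious routine any longer and went looking for a way to get the code on the desktop. It turns out there is one: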

https://www.cnblogs.com/jasondan/p/6508249.html

GitHub - stanzhai/GoldenPassport: A native implementation of Google Authenticator for Mac based on Swift

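The author's idea and goal were exactly the same as mine, and he rolled up his sleeves and solved the problem himself. Knowledge really is productivity.

With a ready-made tool available, I simply updated my earlier scripts so that, even with two-step verification, I can log in and transfer files without typing a password or a verification code. The scripts are below:

Login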

#!/usr/bin/expect -f
# In Tcl a comment must start a command, so trailing comments need ";#".
set user xxxx      ;# your account
set host IP        ;# your IP address
set password xxxx  ;# your password
# Get the verification code from the local host by running a shell command
# inside the expect script; grep captures the six-digit code.
set code [exec sh -c {curl -s "http://localhost:17304" | grep -Eo "[0-9]{6}"}]
set timeout -1

spawn ssh $user@$host
expect "*assword:*"
send "$password\r"
expect "*erification code:*"
send "$code\r"
interact
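
The `curl ... | grep` step in the script can also be sketched in Python, which makes the extraction easier to test on its own. This is only an illustration: the URL is the one used in the script, while the function names and the sample response strings are assumptions, since the exact response format of the local server is not shown here. Unlike the `grep` pattern, the regular expression below refuses to match six digits that are part of a longer number.

```python
import re
import urllib.request

CODE_URL = "http://localhost:17304"  # address used in the expect script


def extract_code(text):
    """Return the first standalone six-digit number in text, or None."""
    match = re.search(r"(?<!\d)\d{6}(?!\d)", text)
    return match.group(0) if match else None


def fetch_code(url=CODE_URL, timeout=5):
    """Fetch the page served on localhost and pull the code out of it."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
    return extract_code(body)


# Made-up response bodies, for illustration only
print(extract_code("cluster-account: 123456"))
print(extract_code("no code here"))
```

Calling `fetch_code()` only works while the local authenticator app is running and serving codes on that port.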

File transfer

#!/usr/bin/expect -f
set src_file [lindex $argv 0]  ;# the file or directory to copy
set dest_dir [lindex $argv 1]  ;# the target directory
set user xxxx      ;# your account
set host IP        ;# your IP address
set password xxxx  ;# your password
set code [exec sh -c {curl -s "http://localhost:17304" | grep -Eo "[0-9]{6}"}]
set timeout -1

spawn scp -r $user@$host:$src_file $dest_dir
expect "*assword:"
send "$password\r"
expect "*erification code:*"
send "$code\r"
expect "100%"
expect eof

One click and it's done. Hooray~

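Installing a Windows virtual machine and Trados on Mac

Posted on 2019-10-25 | Category: Mac
Word count: 206 | Reading time ≈ 1 min

I needed a program that only runs on Windows, so I spent some time trying to set up a virtual machine on my Mac and install the software in it. Here are my notes.

First I tried VirtualBox
Tutorial: installing Win10 in a VirtualBox VM on Mac - Jianshu
Both the Win7 and the Win10 installs failed, and the interface looked odd.

Then I tried installing through Parallels Desktop
Installing a Windows VM on Mac - Jianshu

The Baidu Netdisk link in that guide downloads too slowly; look for another download source.

Win10 crack: http://www.ylmfwin100.com/ylmf/8643.html
That method failed for me as well; you can simply search online for another serial key.

After that, installing Trados is straightforward; a quick search turns up ready-made guides.

Personally I find Parallels Desktop much easier to use than VirtualBox. If you can afford it, do support paid software :)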

Literature

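Posted on 2019-09-26
Word count: 948 | Reading time ≈ 3 min

A library of human gut bacterial isolates paired with longitudinal multiomics data enables mechanistic microbiome research

I plan to apply the method described in How to Read a Book to reading papers, letting theory guide practice, and see how it works out.

I. What is the paper/book about as a whole?

Classify it by kind and subject matter

Gut microbiome; metagenomics; database

State in the shortest possible sentences what the whole book/paper is about

The authors built a microbial library containing 7,758 gut bacterial isolates and 3,632 corresponding genomes, together with some longitudinal multi-omics data. Their findings:

  1. Microbial species stay at stable levels within and between individuals;
  2. Multi-omics measurements averaged over several days are more reliable;
  3. Changes in gut metabolites within an individual are associated with amino acid levels;
  4. Differences between individuals (populations) are associated with differences in bile acids;
  5. Genomic changes can be used to infer the coevolutionary dynamics of strains within an individual and the selection pressures acting on them;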

List the major parts in order and in relation to each other. Once the outline of the whole is drafted, outline each part in turn

1. Isolation of an extensive collection of gut bacterial isolates for in vitro and in vivo testing of mechanistic hypotheses

1.1 Building a library of isolates that cover the diversity of gut bacteria from OpenBiome donors

1.2 The BIO-ML contains diverse taxa associated with human health

2. Ecology and evolutionary dynamics inferred from isolate genomes

2.1 Quality and diversity of BIO-ML isolate genomes

2.2 Resistance to ethanol is more widespread than previously thought and not restricted to spore-formers

2.3 Extensive sampling of isolate genomes reveals the long- and short-term evolution of gut commensal bacteria

3. High-resolution genomic time series from FMT donors

4. Time-series data improve abundance estimations and ecological inferences from metagenomic and 16S data

5. Bacterial genomic diversification within individuals and life-history traits are associated with ecological stability and disturbance of the gut ecosystem

6. Donor fecal metabolomes can be distinguished by their bile-acid profiles, while within-donor variation is driven largely by amino acids

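The question the authors are asking or trying to solve

Research on the interactions between the gut microbiome and the human host is limited by the lack of longitudinal cohort data (needed to explore stability and dynamics), and too few cultured isolates are available to test mechanistic hypotheses.

II. What is said in detail in the paper/book, and how?

There were many parts I didn't really understand, so I'll skip this section for now, haha. I probably only managed an inspectional reading.
This is a kind of reading I'm not good at.

III. Is it true? Does it matter? Critique the book as a communication of knowledge

Not at that level yet… (⇀‸↼‶)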


To be continued…

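Ramblings

Posted on 2019-09-24
Word count: 58 | Reading time ≈ 1 min

Daye (大爷: "old man", also "the boss")

The Beijing sun is so dazzling
that
I cannot tell
the old man collecting scrap
from
the old man playing mahjong:
which one is
the real boss…

20190924: thoughts after reading 大爷收租 ("The Old Man Collects Rent"); and the sun really was dazzling…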

Model for Microbiome Research Writing

Posted on 2019-09-18 | Category: metagenomics
Word count: 1.1k | Reading time ≈ 6 min

This idea comes from: Writing and Presenting in English - The Rosetta Stone of Science
(Chinese edition: 《科技英文写作与讲演-科学的罗塞塔石碑》)

1. Some nice sentences | Odds and ends

  1. Averaging multiple timepoints may be optimal for precisely quantifying abundances of bacterial taxa and functions within individuals. However, there has not been a quantitative assessment of how much improvement is possible, or of how many samples are needed.

  2. Using our longitudinal dataset, we found that each person harbored a stable and unique microbiome structure, both in terms of taxa and broad functional categories

  3. However, we found that the relative abundance of a given ASV (equivalent to 100% OTUs) and of a given clusters of orthologous groups (COG) category fluctuated substantially from day-to-day, but the median relative abundance remained relatively constant

  4. The number of bacterial genes encoded within the human gut vastly outnumber the total complement of genes in Homo sapiens, endowing the gut microbiome with enormous potential for the production of a range of functionally active metabolites.

  5. The mechanisms linking gut microbiota to cardiovascular disease (CVD) are multifaceted, and include direct effects of microbial metabolites on atherosclerosis and thrombosis development, as well as immune modulation by bacteria and their products.

  6. Across >8,000 measured metabolite features, we identified chemicals and chemical classes that were differentially abundant in IBD, including enrichments for sphingolipids and bile acids, and depletions for triacylglycerols and tetrapyrroles.

  1. Metabolomic and metagenomic profiles were broadly correlated with faecal calprotectin levels (a measure of gut inflammation)

  2. We demonstrate a characteristic increase in facultative anaerobes at the expense of obligate anaerobes, as well as molecular disruptions in microbial transcription (for example, among clostridia), metabolite pools (acylcarnitines, bile acids, and short-chain fatty acids), and levels of antibodies in host serum.

  3. Here we present the results, which provide a comprehensive view of functional dysbiosis in the gut microbiome during inflammatory bowel disease activity

  4. Through a number of in vivo and in vitro technologies, Yuan et al. report that high-alcohol-producing Klebsiella pneumoniae (HiAlc Kpn) occurs in a large percentage of individuals with nonalcoholic fatty liver disease (NAFLD) in a Chinese cohort.

  5. The underlying etiology of nonalcoholic fatty liver disease (NAFLD) is believed to be quite varied

  6. Changes in the gut microbiota have been investigated and are believed to contribute to at least some cases of the disease, though a causal relationship remains unclear.

  7. Microbial communities associated with animals exert powerful influences on host physiology, regulating metabolism and immune function, as well as complex host behaviors.

  8. The importance of host–microbiome interactions for maintaining homeostasis and promoting health raises evolutionarily complicated questions about how animals and their microbiomes have coevolved and how these relationships affect the ways that animals interact with their environment.

  9. Here, we review the literature on the contributions of host factors to microbial community structure and corresponding influences of microbiomes on emergent host phenotypes. We focus in particular on animal behaviors as a basis for understanding potential roles for the microbiome in shaping host neurobiology

  10. In certain cases, these connections have an experimentally identified biochemical basis, but in others, these relationships are poorly defined.

  11. This is especially true when considering the human microbiome, in which incredible amounts of diversity and complexity are observed, but ethical considerations limit experimental potential.

  12. Here, we show that a plant diet served raw versus cooked reshapes the murine gut microbiome, with effects attributable to improvements in starch digestibility and degradation of plant-derived compounds.

  13. Therefore, whether BCAA are the cause per se, an epiphenomenon of, or indicators of cardio-metabolic disturbance remains the paramount question.

2. Ending

  1. Still, together our findings suggest that unique and nonlinear changes of the intestinal ecosystem might exist in Pre-DM individuals before transition to T2D.

  2. Further large-scale, longitudinal follow-up studies are needed to delineate how microbial functions changes from prediabetes to diabetes and to address the nature of interactions between the gut microbiota and the host in the transitional phases leading to overt T2D.

  3. Further investigation is warranted to define the role of these amino acids in atherosclerosis and CVD, which may serve as a basis for the development of anti-atherogenic nutritional and therapeutic approaches.

3. Beginning

  1. Mucosal immunology research continues its fascination with microbial metabolites. In 2019, researchers uncovered extended functions for microbial metabolites in immunity, deepening our understanding of the regulation and function of metabolite-reactive immune cells, and revealed the receptors by which immune cells can recognize bioactive microbial metabolites

  2. The rise of ancient genomics has revolutionised our understanding of human prehistory but this work depends on the availability of suitable samples.

  1. Intestinal dysbiosis could act as an early environmental modulator and may be a target of future preventive interventions in individuals at risk of RA, before the onset of the disease.

  2. Our results, together with previous studies in patients with early RA and recent mechanistic studies, support the mucosal origins hypothesis and the role of intestinal dysbiosis in the development of RA.

  3. Hundreds of clinical studies have demonstrated associations between the human microbiome and disease, yet fundamental questions remain on how we can generalize this knowledge.

  4. Clinical, genetic and microbiome evidence supports the contention that ankylosing spondylitis (AS) is influenced by interactions between the gut microbiome and the host immune system.

  5. The majority of microbiome studies to date have been conducted via sequencing of the 16S ribosomal marker gene, limiting the scope and accuracy of downstream analytics.

  6. One feature of inflammation-associated gut microbiotas is enrichment of motile bacteria, which can facilitate microbiota encroachment into the mucosa and activate pro-inflammatory gene expression.

  7. Here, we set out to investigate whether elicitation of mucosal anti-flagellin antibodies by direct administration of purified flagellin might serve as a general vaccine against subsequent development of chronic gut inflammation.

4. Verb

  1. We show, in mice, that repeated injection of flagellin elicits increases in fecal anti-flagellin IgA and alterations in microbiota composition, reduces fecal flagellin concentration, prevents microbiota encroachment, protects against IL-10 deficiency-induced colitis, and ameliorates diet-induced obesity.

  2. Thus, administration of flagellin, and perhaps other pathobiont antigens, may confer some protection against chronic inflammatory diseases.

5. Transition

To be added

6. Structure

To be added

7. Headline

The microbiome in cancer immunotherapy: Diagnostic tools and therapeutic strategies

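Using the terminal for remote work on Mac

Posted on 2019-09-05 | Category: Mac
Word count: 517 | Reading time ≈ 2 min

Three parts: 1. terminal setup; 2. password-free cluster login; 3. password-free file transfer between the local machine and the cluster.

1. Terminal setup

Recommended recipe: iTerm2 + Oh My Zsh + zsh. There are plenty of tutorials online and I won't repeat the advantages here; one reference: Syntax highlighting and theming for iTerm 2 on Mac - Jianshu

2. Password-free cluster login

The benefit is not having to type the IP and password every time; personally I find it much nicer to use than PuTTY and the like.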

## 1. First write an expect script for the automated interaction, e.g. login.expect
# login.expect
#!/usr/bin/expect -f
# In Tcl a comment must start a command, so trailing comments need ";#".
set user account   ;# cluster account
set host IP        ;# cluster IP
set password xxxx  ;# cluster login password
set timeout -1

spawn ssh $user@$host
expect "*assword:*"
send "$password\r"
interact

## 2. Then log in from the command line with: expect login.expect
# To save typing, add an alias to .zshrc (or .bashrc), using the script's full path:
alias lg="expect /path/to/login.expect"
# After that, typing lg logs you in. For several IPs, set up one script and one alias per host.

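3. Password-free file transfer between the local machine and the cluster

If you prefer a graphical interface, ready-made software will do, for example FileZilla - The free FTP solution
If you prefer the command line, you can also transfer files with a script, in much the same way as above.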

## 1. First write an expect script, e.g. scp.expect
# scp.expect
#!/usr/bin/expect -f
set src_file [lindex $argv 0]  ;# source path, taken from the command line
set dest_dir [lindex $argv 1]  ;# destination path, taken from the command line
set user account   ;# cluster account
set host IP        ;# cluster IP
set password xxxx  ;# cluster password
set timeout -1

spawn scp -r $user@$host:$src_file $dest_dir  ;# cluster to local
# spawn scp -r $src_file $user@$host:$dest_dir  ;# local to cluster
# spawn rsync -avl $user@$host:$src_file $dest_dir  ;# transfer with rsync instead
expect "*assword:"
send "$password\r"
expect "100%"
expect eof

## 2. Then transfer from the command line with: expect scp.expect <source> <destination>
# To save typing, add an alias to .zshrc (or .bashrc):
alias scp="expect scp.expect"
# After that, typing scp runs the transfer (note that this alias shadows the system scp in interactive shells). For several IPs, set up one script per host.
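
The two `spawn scp` lines in the script differ only in the order of their arguments. A small Python sketch of that rule, with a hypothetical helper name and made-up account, address and paths, may make the two directions easier to keep apart:

```python
def build_scp_command(user, host, remote_path, local_path, to_cluster=False):
    """Build the scp argument list for one transfer direction.

    to_cluster=False copies from the cluster to the local machine,
    to_cluster=True copies from the local machine to the cluster.
    """
    remote = f"{user}@{host}:{remote_path}"
    if to_cluster:
        return ["scp", "-r", local_path, remote]
    return ["scp", "-r", remote, local_path]


# Hypothetical account, address and paths
download = build_scp_command("alice", "10.0.0.1", "/data/run1", "./run1")
upload = build_scp_command("alice", "10.0.0.1", "/data/incoming", "./reads", to_cluster=True)
print(" ".join(download))
print(" ".join(upload))
```

scp always takes the source first and the destination second, so swapping the direction just means swapping which side carries the `user@host:` prefix.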

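Using SparCC on a cluster

Posted on 2019-09-05 | Category: metagenomics
Word count: 258 | Reading time ≈ 1 min

SparCC depends on a specific Python version and specific library versions, so a virtual environment is a fairly painless option, especially on a cluster that is complicated and not under your own control.

The script is below (work in progress)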

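# Set conda's default locations for environments and packages
conda config --add envs_dirs yourPath
conda config --add pkgs_dirs yourPath
export PATH=condaPath:$PATH   # condaPath should point to conda's bin directory

# Then install
conda create --name SparCCEnv python=2.6.9
source activate SparCCEnv     # on conda >= 4.4: conda activate SparCCEnv
conda install numpy=1.9.2
conda install pandas=0.16.2

# Use by another user
cp yourhomePath/.condarc homePath   # copy the .condarc from the installer's home into the other user's home
export PATH=condaPath:$PATH   # condaPath should point to conda's bin directory
source activate SparCCEnv     # enter the virtual environment

# Run through a wrapper script; I wrote this one myself, contact me if you need it
python run_sparcc.py profile.xls mgs Outdir

# Leave the virtual environment
source deactivate             # on conda >= 4.4: conda deactivate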

References

  1. SparCC_bitbucket
  2. SparCC · hallamlab/utilities Wiki · GitHub
  3. yonatanf / SparCC / issues / #18 - Error: tuple index out of range — Bitbucket
  4. Inferring Correlation Networks from Genomic Survey Data
  5. GitHub - zdk123/SpiecEasi: Sparse InversE Covariance estimation for Ecological Association and Statistical Inference

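Designing a skin microbiome study

Posted on 2019-09-05 | Category: metagenomics
Word count: 59 | Reading time ≈ 1 min

Designing a skin microbiome study

  1. How to obtain enough biomass while avoiding contamination
    How to Design a Skin Microbiome Study, Part I: Sampling
  2. Choosing a sequencing strategy; discusses V1-V3 versus V4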
    https://blog.microbiomeinsights.com/design-skin-microbiome-study-part-ii-amplicon-sequencing

The brain-gut axis

Posted on 2019-08-29 | Category: metagenomics
Word count: 147 | Reading time ≈ 1 min

Brain-gut connection explains why integrative treatments can help relieve digestive ailments

Brain-gut connection explains why integrative treatments can help relieve digestive ailments - Harvard Health Blog - Harvard Health Publishing

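Summary: the gut and the brain influence each other, so staying in good spirits while ill matters a great deal, and so does diet.
Stress, depression and similar states can act through the sympathetic nervous system to affect gut motility and gut contents, make the digestive system more sensitive to pain, alter gut permeability, affect the immune system, increase inflammation, and change the composition of the gut microbiota.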


Nonewood

© 2024 Nonewood | Site words total count: 144.4k