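Perl file handles work not only with ordinary files but also with pipes. Today I needed to count the number of sequences and bases in a FASTQ file. NGS FASTQ files are usually gzip-compressed, so the script has to read the contents of a compressed file. The code is as follows: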
    use strict;
    use warnings;

    my ($fastq) = @ARGV;
    my ($reads, $bases) = cal_sequence_info($fastq);
    print qq{$reads $bases\n};

    sub cal_sequence_info {
        my $fastq = shift;
        # A .gz file is read through "gzip -dc ... |"; a plain file is opened directly.
        my $file_handle = $fastq =~ /\.gz$/ ? qq{gzip -dc $fastq |} : qq{$fastq};
        open FASTQ, $file_handle or die "Can't open $fastq: $!";
        my ($reads, $bases) = (0, 0);
        # Each FASTQ record has four lines: read ID, sequence, "+" comment, quality.
        while (my $readid = <FASTQ>) {
            my $sequence = <FASTQ>;
            my $comment  = <FASTQ>;
            my $quality  = <FASTQ>;
            chomp($sequence);
            $reads++;
            $bases += length $sequence;
        }
        close FASTQ;
        return ($reads, $bases);
    }
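This uses a Linux pipe directly: when the string passed to open ends with "|", Perl runs the command and reads its standard output, which makes it easy to read the contents of a compressed file.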
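The script above hands a command string to the shell, so a file name containing spaces or shell metacharacters could break it. A possible variation is sketched below. It uses the list form of a pipe open, which runs gzip without a shell. The helper names open_fastq and count_fastq are made up for this illustration, and the sketch assumes a Unix-like system with gzip on the PATH and strict four-line FASTQ records.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the original script): open a FASTQ file for
# reading, decompressing through gzip when the name ends in .gz.
sub open_fastq {
    my ($path) = @_;
    my $fh;
    if ($path =~ /\.gz$/) {
        # List-form pipe open: gzip is executed directly, without a shell,
        # so $path is passed to it as a single literal argument.
        open($fh, '-|', 'gzip', '-dc', $path)
            or die "Can't start gzip for $path: $!\n";
    } else {
        open($fh, '<', $path) or die "Can't open $path: $!\n";
    }
    return $fh;
}

# Count reads and bases, assuming four lines per record:
# read ID, sequence, "+" separator, quality.
sub count_fastq {
    my ($path) = @_;
    my $fh = open_fastq($path);
    my ($reads, $bases) = (0, 0);
    while (defined(my $readid = <$fh>)) {
        my $sequence = <$fh>;
        last unless defined $sequence;    # stop on a truncated final record
        my $separator = <$fh>;
        my $quality   = <$fh>;
        $sequence =~ s/\r?\n\z//;         # strip LF or CRLF line ending
        $reads++;
        $bases += length $sequence;
    }
    # For a pipe, close waits for gzip and returns false on a non-zero exit.
    close($fh) or die "Error while reading $path (exit status $?)\n";
    return ($reads, $bases);
}

# Only run when a file name is given on the command line.
if (@ARGV) {
    my ($reads, $bases) = count_fastq($ARGV[0]);
    print "$reads $bases\n";
}
```

Checking the return value of close matters for the pipe case: a corrupt archive makes gzip exit with an error, and that is only visible when the handle is closed.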