Commit aabfe06

author
lboat
committed
Updated JShell post
1 parent 249137f commit aabfe06

1 file changed

+61
-36
lines changed

how-to-use-bioinformatics-libraries-in-jshell.html

Lines changed: 61 additions & 36 deletions
@@ -4,7 +4,7 @@
44
<meta charset="utf-8">
55
<meta http-equiv="X-UA-Compatible" content="IE=edge">
66
<meta name="viewport" content="width=device-width, initial-scale=1">
7-
<meta name="description" content="export CLASSPATH=&#34;/home/lboat/IdeaProjects/htsjdk/build/classes/java/main/&#34; jshell -c /home/lboat/IdeaProjects/htsjdk/build/classes/java/main/ import...">
7+
<meta name="description" content="I&#39;m a big fan of Java&#39;s new read-evaluate-print-loop (REPL) environment, JShell. For beginners in Java, JShell provides a fantastic environment to troubleshoot issues and get comfortable with...">
88
<meta name="keywords" content="">
99
<link rel="icon" href="./favicon.ico">
1010

@@ -77,41 +77,66 @@ <h1 class="header-title">How-to: Use Bioinformatics Libraries in JShell</h1>
7777

7878
<!-- Content -->
7979
<div class="container content">
80-
<p>export CLASSPATH="/home/lboat/IdeaProjects/htsjdk/build/classes/java/main/"
81-
jshell -c /home/lboat/IdeaProjects/htsjdk/build/classes/java/main/</p>
82-
<p>import htsjdk.samtools.reference.ReferenceSequenceFile
83-
import htsjdk.samtools.reference.ReferenceSequenceFileFactory
84-
import htsjdk.samtools.reference.FastaSequenceIndexCreator
85-
import htsjdk.samtools.reference.FastaSequenceIndex
86-
import java.nio.file.Paths
87-
import java.nio.file.Path</p>
88-
<h1>Get the path to our file</h1>
89-
<p>Path path = Paths.get("/home/lboat/Documents/practice/Java/big.fa")</p>
90-
<h1>Generate a FASTA object</h1>
91-
<p>ReferenceSequenceFile fasta = ReferenceSequenceFileFactory.getReferenceSequenceFile(path)</p>
92-
<h1>Index our FASTA file</h1>
93-
<p>FastaSequenceIndex fsi = FastaSequenceIndexCreator.buildFromFasta(path)</p>
94-
<h1>Determine the number of sequences in our file</h1>
95-
<p>fsi.size()</p>
96-
<h1>Get sequence-specific information</h1>
97-
<p>fsi.getIndexEntry("sequence_5")
98-
contig sequence_5; location 5152; size 999; basesPerLine 60; bytesPerLine 61</p>
99-
<h1>Notice, our FASTA isn't recognized as indexed</h1>
100-
<p>fasta.isIndexed()
101-
false</p>
102-
<h1>Write our index</h1>
103-
<p>fsi.write(Paths.get("/home/lboat/Documents/practice/Java/big.fa.fai"))</p>
104-
<h1>Re-open the FASTA using this index</h1>
105-
<p>fasta = ReferenceSequenceFileFactory.getReferenceSequenceFile(path)</p>
106-
<h1>Notice, our FASTA is now indexed</h1>
107-
<p>fasta.isIndexed()
108-
true</p>
109-
<h1>You may want to check the first sequence</h1>
110-
<p>ReferenceSequence rs = fasta.nextSequence()</p>
111-
<h1>Look at its name</h1>
112-
<p>rs.getName()</p>
113-
<h1>Look at the sequence</h1>
114-
<p>rs.getBaseString()</p>
80+
<p>I'm a big fan of Java's new read-evaluate-print loop (REPL) environment, JShell. For Java beginners, JShell is a fantastic place to troubleshoot issues and get comfortable with the language and with new packages. As this is a bioinformatics blog, I will demonstrate how to read FASTA files using the htsjdk (High-throughput sequencing Java Development Kit) library.</p>
81+
<p>First, we have to make the library's classes available to JShell. Here, I've downloaded the library and compiled it with my IntelliJ IDE (integrated development environment). In bash, I export CLASSPATH so that it points to the directory of compiled Java bytecode, and I pass the same directory to JShell as its class path when launching it.</p>
82+
<div class="highlight"><pre><span></span><span class="nb">export</span> <span class="nv">CLASSPATH</span><span class="o">=</span><span class="s2">&quot;/home/lboat/IdeaProjects/htsjdk/build/classes/java/main/&quot;</span>
83+
jshell -c /home/lboat/IdeaProjects/htsjdk/build/classes/java/main/
84+
</pre></div>
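<p>Before importing anything, it can help to confirm that JShell actually picked up the class path. The following is a minimal sketch using only the standard library; the class name <code>ClasspathCheck</code> and the method <code>isOnClasspath</code> are hypothetical names chosen for illustration, and the definition can be pasted into a JShell session as-is.</p>

```java
public class ClasspathCheck {

    /**
     * Returns true if the named class can be loaded from the current class path.
     * (Hypothetical helper for illustration; not part of htsjdk or JShell.)
     */
    static boolean isOnClasspath(String className) {
        try {
            Class.forName(className);
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            // The class, or something it depends on, could not be found or loaded.
            return false;
        }
    }

    public static void main(String[] args) {
        // Prints true only if the htsjdk classes are visible on the class path.
        System.out.println(isOnClasspath("htsjdk.samtools.reference.ReferenceSequenceFile"));
    }
}
```

<p>If the check returns false inside JShell, the class path handed to JShell most likely does not point at the directory (or JAR) that contains the compiled htsjdk classes.</p>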
85+
86+
87+
<p>Now that JShell is open, we can import the classes we need.</p>
88+
<div class="highlight"><pre><span></span><span class="kn">import</span> <span class="nn">htsjdk.samtools.reference.ReferenceSequenceFile</span>
89+
<span class="kn">import</span> <span class="nn">htsjdk.samtools.reference.ReferenceSequenceFileFactory</span>
90+
<span class="kn">import</span> <span class="nn">htsjdk.samtools.reference.FastaSequenceIndexCreator</span>
91+
<span class="kn">import</span> <span class="nn">htsjdk.samtools.reference.FastaSequenceIndex</span>
<span class="kn">import</span> <span class="nn">htsjdk.samtools.reference.ReferenceSequence</span>
92+
<span class="kn">import</span> <span class="nn">java.nio.file.Paths</span>
93+
<span class="kn">import</span> <span class="nn">java.nio.file.Path</span>
94+
</pre></div>
95+
96+
97+
<p>Now we get the path to the FASTA file and generate a FASTA object using a factory class (a class whose job is to create objects of another type). We then build an index for the file.</p>
98+
<div class="highlight"><pre><span></span><span class="c1">// Get the path to our file</span>
99+
<span class="n">Path</span> <span class="n">path</span> <span class="o">=</span> <span class="n">Paths</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="s">&quot;big.fa&quot;</span><span class="o">)</span>
100+
101+
<span class="c1">// Generate a FASTA object</span>
102+
<span class="n">ReferenceSequenceFile</span> <span class="n">fasta</span> <span class="o">=</span> <span class="n">ReferenceSequenceFileFactory</span><span class="o">.</span><span class="na">getReferenceSequenceFile</span><span class="o">(</span><span class="n">path</span><span class="o">)</span>
103+
104+
<span class="c1">// Index our FASTA file</span>
105+
<span class="n">FastaSequenceIndex</span> <span class="n">fsi</span> <span class="o">=</span> <span class="n">FastaSequenceIndexCreator</span><span class="o">.</span><span class="na">buildFromFasta</span><span class="o">(</span><span class="n">path</span><span class="o">)</span>
106+
</pre></div>
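<p>To make "indexing" less abstract, here is a minimal sketch of what a FASTA index records for each sequence: its name, the byte offset of its first base, its length, and the line layout. This is not htsjdk's implementation; the class <code>FaiSketch</code>, its nested <code>Entry</code> type, and the <code>index</code> method are hypothetical names, and the sketch assumes a well-formed ASCII FASTA in which every sequence line except the last has the same length.</p>

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

public class FaiSketch {

    /** One index record, holding the five fields found on a .fai line. */
    static final class Entry {
        final String name;
        final long location;     // byte offset of the first base
        final long size;         // number of bases in the sequence
        final int basesPerLine;  // bases on a full sequence line
        final int bytesPerLine;  // the same line including its terminator

        Entry(String name, long location, long size, int basesPerLine, int bytesPerLine) {
            this.name = name;
            this.location = location;
            this.size = size;
            this.basesPerLine = basesPerLine;
            this.bytesPerLine = bytesPerLine;
        }

        @Override
        public String toString() {
            return "contig " + name + "; location " + location + "; size " + size
                    + "; basesPerLine " + basesPerLine + "; bytesPerLine " + bytesPerLine;
        }
    }

    /** Scans the raw FASTA bytes once and returns one Entry per sequence. */
    static List<Entry> index(byte[] fasta) {
        List<Entry> entries = new ArrayList<>();
        String name = null;
        long location = 0, size = 0;
        int basesPerLine = 0, bytesPerLine = 0;
        int pos = 0;
        while (pos < fasta.length) {
            int end = pos;
            while (end < fasta.length && fasta[end] != '\n') end++;
            int next = Math.min(end + 1, fasta.length); // first byte of the next line
            int lineBytes = next - pos;                  // includes the line terminator
            int lineChars = end - pos;
            if (lineChars > 0 && fasta[end - 1] == '\r') lineChars--; // tolerate CRLF

            if (lineChars > 0 && fasta[pos] == '>') {
                // A header line closes the previous sequence and opens a new one.
                if (name != null) entries.add(new Entry(name, location, size, basesPerLine, bytesPerLine));
                String header = new String(fasta, pos + 1, lineChars - 1, StandardCharsets.US_ASCII);
                name = header.split("\\s+")[0]; // the name ends at the first whitespace
                location = next;
                size = 0;
                basesPerLine = 0;
                bytesPerLine = 0;
            } else if (name != null && lineChars > 0) {
                if (basesPerLine == 0) { // the first sequence line defines the layout
                    basesPerLine = lineChars;
                    bytesPerLine = lineBytes;
                }
                size += lineChars;
            }
            pos = next;
        }
        if (name != null) entries.add(new Entry(name, location, size, basesPerLine, bytesPerLine));
        return entries;
    }

    public static void main(String[] args) {
        byte[] demo = ">seq1 demo\nACGT\nAC\n>seq2\nGGG\n".getBytes(StandardCharsets.US_ASCII);
        for (Entry e : index(demo)) System.out.println(e);
    }
}
```

<p>The <code>toString</code> format here simply mimics the index entry output shown in this post, so the two are easy to compare by eye.</p>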
107+
108+
109+
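<p>The index now exists in memory, so we can readily check some metrics for this FASTA. Note that the FASTA object itself is not reported as indexed until the index has been written to disk next to the file and the FASTA has been re-opened.</p>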
110+
<div class="highlight"><pre><span></span><span class="c1">// Determine the number of sequences in our file</span>
111+
<span class="n">fsi</span><span class="o">.</span><span class="na">size</span><span class="o">()</span>
112+
113+
<span class="c1">// Get sequence-specific information</span>
114+
<span class="n">fsi</span><span class="o">.</span><span class="na">getIndexEntry</span><span class="o">(</span><span class="s">&quot;sequence_5&quot;</span><span class="o">)</span>
115+
<span class="c1">// contig sequence_5; location 5152; size 999; basesPerLine 60; bytesPerLine 61</span>
116+
117+
<span class="c1">// Notice, our FASTA isn&#39;t recognized as indexed</span>
118+
<span class="n">fasta</span><span class="o">.</span><span class="na">isIndexed</span><span class="o">()</span>
119+
<span class="c1">// false</span>
120+
121+
<span class="c1">// Write our index</span>
122+
<span class="n">fsi</span><span class="o">.</span><span class="na">write</span><span class="o">(</span><span class="n">Paths</span><span class="o">.</span><span class="na">get</span><span class="o">(</span><span class="s">&quot;big.fa.fai&quot;</span><span class="o">))</span>
123+
124+
<span class="c1">// Re-open the FASTA using this index</span>
125+
<span class="n">fasta</span> <span class="o">=</span> <span class="n">ReferenceSequenceFileFactory</span><span class="o">.</span><span class="na">getReferenceSequenceFile</span><span class="o">(</span><span class="n">path</span><span class="o">)</span>
126+
127+
<span class="c1">// Notice, our FASTA is now indexed</span>
128+
<span class="n">fasta</span><span class="o">.</span><span class="na">isIndexed</span><span class="o">()</span>
129+
<span class="c1">// true</span>
130+
131+
<span class="c1">// You may want to check the first sequence</span>
132+
<span class="n">ReferenceSequence</span> <span class="n">rs</span> <span class="o">=</span> <span class="n">fasta</span><span class="o">.</span><span class="na">nextSequence</span><span class="o">()</span>
133+
134+
<span class="c1">// Look at its name</span>
135+
<span class="n">rs</span><span class="o">.</span><span class="na">getName</span><span class="o">()</span>
136+
137+
<span class="c1">// Look at the sequence</span>
138+
<span class="n">rs</span><span class="o">.</span><span class="na">getBaseString</span><span class="o">()</span>
139+
</pre></div>
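<p>The index entry printed for <code>sequence_5</code> (location 5152; size 999; basesPerLine 60; bytesPerLine 61) is what makes random access possible: the byte position of any base follows from simple arithmetic, with no need to scan the file. The sketch below shows that arithmetic under the usual assumption that every sequence line except the last has the same length. The class <code>FaiOffset</code> and the method <code>byteOffset</code> are hypothetical names for illustration, not htsjdk API.</p>

```java
public class FaiOffset {

    /**
     * Byte position in the FASTA file of the base at the given 0-based index,
     * computed from the fields of an index entry.
     */
    static long byteOffset(long location, int basesPerLine, int bytesPerLine, long baseIndex) {
        long fullLines = baseIndex / basesPerLine;     // complete lines before this base
        long withinLine = baseIndex % basesPerLine;    // position on its own line
        return location + fullLines * bytesPerLine + withinLine;
    }

    public static void main(String[] args) {
        // Values taken from the sequence_5 index entry shown in this post.
        System.out.println(byteOffset(5152, 60, 61, 0));   // first base
        System.out.println(byteOffset(5152, 60, 61, 60));  // first base of the second line
        System.out.println(byteOffset(5152, 60, 61, 998)); // last base (size is 999)
    }
}
```

<p>The one-byte difference between basesPerLine and bytesPerLine is the newline at the end of each line, which is why each full line skipped adds 61 bytes rather than 60.</p>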
115140

116141

117142
