API documentation
calculate_pcr_product(sequence: FastaSequence, forward_primer: FastaSequence, reverse_primer: FastaSequence, min_product_length: Union[int, None] = None, max_product_length: Union[int, None] = None, header: bool = True, cols: str = 'all', output_file: Union[bool, str] = False) -> str
Returns the products amplified by a pair of primers against a single sequence.
This function is meant to be used by get_pcr_products, but if you'd like to use it separately, you must supply FastaSequence objects. A FastaSequence is a convenient object that couples a header with a DNA string. For example:
>>> forward_primer = FastaSequence("test_forward_primer", "ACTG")
>>> reverse_primer = FastaSequence("test_reverse_primer", "ATTA")
>>> target_sequence = FastaSequence("target_sequence", "ATGCTGATGCATGCTA")
Inputs
sequence: FastaSequence
The fasta sequence to test for amplification.
forward_primer: FastaSequence
The forward primer to use.
reverse_primer: FastaSequence
The reverse primer to use. Note that you should supply this
min_product_length: None | int
If provided, only return products whose length is greater than or equal to this number. Defaults to None, which applies no lower limit.
max_product_length: None | int
If provided, only return products whose length is less than or equal to this number. Defaults to None, which applies no upper limit.
header: bool
Whether to print a header row in the results. Defaults to True. Set to False to omit the header.
cols: str
Which columns to print. Defaults to "all", which prints every column. Supply a space-separated string of column names to output only the columns of interest. For example, cols="fpri rpri pname" outputs only the names of the forward primer, the reverse primer, and the target sequence when a product is found. Available options are:
- fpri - the name of the forward primer
- rpri - the name of the reverse primer
- start - the start location of the product in the target sequence
- end - the end location of the product in the target sequence
- length - the length of the product
- pname - the name of the sequence in which the target was found
- pseq - the nucleotide sequence of the amplified product
output_file: bool | str
The file to write the results to. Defaults to False, which writes no file. Providing a string creates the output file at that location. If set to True without providing a string, the output file will be of the form
Outputs
A tab-separated string containing all of the products amplified by the supplied primer pair in the target sequence.
The fields are
- Forward primer name
- Reverse primer name
- Start position of the product in the target sequence
- End position of the product in the target sequence
- Product length
- The product
Source code in ispcr/__init__.py
get_pcr_products(primer_file: str, sequence_file: str, min_product_length: Union[int, None] = None, max_product_length: Union[int, None] = None, header: Union[bool, str] = True, cols: str = 'all', output_file: Union[bool, str] = False) -> str
Returns all the products amplified by a set of primers in all sequences in a fasta file.
Inputs
primer_file: str
The path to the fasta file containing the primers to be tested. Currently, this primer file is expected to contain only two sequences, with the forward primer appearing first. For example:
>test_1.f
AGTCA
>test_2.r
TTATGC
sequence_file: str
The path to the fasta file containing the sequences to test the primers against.
min_product_length: None | int
If provided, only return products whose length is greater than or equal to this number. Defaults to None, which applies no lower limit.
max_product_length: None | int
If provided, only return products whose length is less than or equal to this number. Defaults to None, which applies no upper limit.
header: bool | str
Whether to print a header row in the results. Defaults to True. Set to False to omit the header.
cols: str
Which columns to print. Defaults to "all", which prints every column. Supply a space-separated string of column names to output only the columns of interest. For example, cols="fpri rpri pname" outputs only the names of the forward primer, the reverse primer, and the target sequence when a product is found. Available options are:
- fpri - the name of the forward primer
- rpri - the name of the reverse primer
- start - the start location of the product in the target sequence
- end - the end location of the product in the target sequence
- length - the length of the product
- pname - the name of the sequence in which the target was found
- pseq - the nucleotide sequence of the amplified product
output_file: bool | str
The file to write the results to. Defaults to False, which writes no file. Providing a string creates the output file at that location. If set to True without providing a string, the output file will be of the form
Outputs
A tab-separated string containing all of the products amplified by the primers contained in the primer file.
The fields are
- Forward primer name
- Reverse primer name
- Start position of the product in the target sequence
- End position of the product in the target sequence
- Product length
- The product
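Because the return value is a single tab-separated string, it usually needs to be split into records before further use. The sketch below shows one way to do that with the standard library. The `parse_results` helper is hypothetical, the sample string is made up for illustration, and the header names are assumed to match the `cols` keys; check them against the header your installed version actually emits.

```python
import csv
import io
from typing import Dict, List


def parse_results(results: str) -> List[Dict[str, str]]:
    """Hypothetical helper: split a tab-separated result string into rows.

    Assumes the first line is a header row (header=True).
    """
    reader = csv.DictReader(io.StringIO(results), delimiter="\t")
    return list(reader)


# Made-up sample output; header names are an assumption, not the
# library's confirmed header.
sample = (
    "fpri\trpri\tstart\tend\tlength\tpname\tpseq\n"
    "test_1.f\ttest_2.r\t2\t19\t17\ttarget_1\tACGTAGGGCCCATTGAC\n"
)

rows = parse_results(sample)
for row in rows:
    # All values arrive as strings, so convert numeric fields explicitly.
    print(row["pname"], int(row["length"]))
```

If the header is disabled, `csv.reader` with `delimiter="\t"` would be the closer fit, since there are no field names to key on.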
Source code in ispcr/__init__.py
FastaSequence
Generic fasta sequence class.
utils
This module contains various utilities used during in silico PCR.
desired_product_size(potential_product_length: int, min_product_length: Union[int, None] = None, max_product_length: Union[int, None] = None) -> bool
Determines if a potential product's size is in the user's desired product range.
Inputs
potential_product_length: int
The length of the potential product.
min_product_length: int | None
The minimum product size the user will accept. If None, there is no lower limit.
max_product_length: int | None
The maximum product size the user will accept. If None, there is no upper limit.
Outputs
A boolean indicating whether the product length is between the minimum and maximum product lengths.
Example
>>> desired_product_size(100, 75, 125)
True
>>> desired_product_size(200, 75, 125)
False
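The range check described above can be sketched as follows. This is an illustrative stand-in, not the library's source: `product_size_ok` is a hypothetical name, and the bounds are treated as inclusive, which matches the "greater than or equal to" and "less than or equal to" wording used for min_product_length and max_product_length.

```python
from typing import Optional


def product_size_ok(
    length: int,
    min_len: Optional[int] = None,
    max_len: Optional[int] = None,
) -> bool:
    """Hypothetical sketch: True if length lies within the inclusive range.

    A bound of None means that side of the range is unlimited.
    """
    if min_len is not None and length < min_len:
        return False
    if max_len is not None and length > max_len:
        return False
    return True


print(product_size_ok(100, 75, 125))  # True
print(product_size_ok(200, 75, 125))  # False
print(product_size_ok(200))           # True, no limits set
```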
Source code in ispcr/utils.py
filter_output_line(output_line: str, column_indices: List[int]) -> str
Filters a single line of isPCR results based on selected column indices.
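As a rough illustration of this kind of filtering, the sketch below keeps only the requested fields of one result line. `keep_columns` is a hypothetical name, and the sketch assumes tab-separated fields and 0-based indices; how the real function treats the trailing newline is not documented here.

```python
from typing import List


def keep_columns(line: str, indices: List[int]) -> str:
    """Hypothetical sketch: keep only the fields at the given 0-based indices.

    Assumes tab-separated input; the trailing newline is dropped.
    """
    fields = line.rstrip("\n").split("\t")
    return "\t".join(fields[i] for i in indices)


line = "test_1.f\ttest_2.r\t2\t19\t17\n"
print(keep_columns(line, [0, 1, 4]))  # test_1.f, test_2.r, 17 (tab-separated)
```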
Source code in ispcr/utils.py
get_column_indices(header_string: str) -> List[int]
Returns column indices based on a column header string.
Source code in ispcr/utils.py
is_valid_cols_string(header_string: str) -> bool
Internal helper to check if a column header string is valid.
Source code in ispcr/utils.py
parse_selected_cols(cols: str) -> List[int]
Returns a list of int indices based on a column header string.
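One plausible way to turn a `cols` string into indices is sketched below. The names `COLUMN_ORDER` and `cols_to_indices` are hypothetical, and the column order is an assumption taken from the order in which the options are listed for the `cols` parameter; the real function may order or validate columns differently.

```python
from typing import List

# Assumed column order, based on the documented list of cols options.
COLUMN_ORDER = ["fpri", "rpri", "start", "end", "length", "pname", "pseq"]


def cols_to_indices(cols: str) -> List[int]:
    """Hypothetical sketch: map a space-separated cols string to indices."""
    if cols.strip() == "all":
        return list(range(len(COLUMN_ORDER)))
    names = cols.split()
    unknown = [name for name in names if name not in COLUMN_ORDER]
    if unknown:
        raise ValueError(f"Unknown column name(s): {', '.join(unknown)}")
    return [COLUMN_ORDER.index(name) for name in names]


print(cols_to_indices("fpri rpri pname"))  # [0, 1, 5]
print(cols_to_indices("all"))              # [0, 1, 2, 3, 4, 5, 6]
```

Raising on unknown names is a design choice of this sketch; it surfaces typos early instead of silently dropping a column.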
Source code in ispcr/utils.py
read_fasta(fasta_file: TextIO) -> Iterator[FastaSequence]
An iterator for fasta files.
Inputs
------
fasta_file: TextIO
An open file for reading
Outputs
-------
An iterator yielding the sequence names and sequences from a fasta file
Example
-------
input_file = 'tests/test_data/sequences/met_r.fa.fasta'
with open(input_file) as fin:
    for name, seq in read_fasta(fin):
        print(f'{name}\n{seq}')
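For readers who want to see the general shape of such an iterator, here is a minimal, self-contained sketch. It is not the library's code: `iter_fasta` is a hypothetical name, it yields plain (name, sequence) tuples rather than FastaSequence objects, and it assumes that header lines start with ">" and that a sequence may span several lines.

```python
import io
from typing import Iterator, List, Optional, TextIO, Tuple


def iter_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Hypothetical sketch: yield (name, sequence) pairs from a fasta handle."""
    name: Optional[str] = None
    chunks: List[str] = []
    for raw_line in handle:
        line = raw_line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield name, "".join(chunks)
            name = line[1:]
            chunks = []
        elif name is not None:
            chunks.append(line)
    # Emit the final record once the input is exhausted.
    if name is not None:
        yield name, "".join(chunks)


fasta_text = ">test_1.f\nAGTCA\n>test_2.r\nTTAT\nGC\n"
records = list(iter_fasta(io.StringIO(fasta_text)))
print(records)  # [('test_1.f', 'AGTCA'), ('test_2.r', 'TTATGC')]
```

Using `io.StringIO` keeps the example independent of any file on disk; an open file object works the same way.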
Source code in ispcr/utils.py
read_sequences_from_file(primer_file: str) -> List[FastaSequence]
Reads a fasta file, converts the sequences to FastaSequences, and returns them in a list.
Source code in ispcr/utils.py
reverse_complement(dna_string: str) -> str
Returns the reverse complement of a DNA string.
Inputs
dna_string: str
A string representing a DNA sequence. Supported bases are A, C, G, and T.
Outputs
The reverse complement of dna_string.
Raises
KeyError
Raised if there is a base in dna_string that is not one of A, C, G, or T.
Example
>>> reverse_complement('GCTGA')
'TCAGC'
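The behaviour described above, including the KeyError on unsupported bases, falls out naturally from a dictionary lookup. The following is a minimal sketch under that assumption; `revcomp` and `_COMPLEMENT` are hypothetical names, and the real implementation may differ, for example in how it treats lowercase input.

```python
# Complement table for the four supported bases (uppercase only here).
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def revcomp(dna: str) -> str:
    """Hypothetical sketch: reverse complement via dictionary lookup.

    Any character outside A, C, G, T raises KeyError from the lookup.
    """
    return "".join(_COMPLEMENT[base] for base in reversed(dna))


print(revcomp("GCTGA"))  # TCAGC

try:
    revcomp("GCNGA")
except KeyError as err:
    print(f"Unsupported base: {err}")
```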
Source code in ispcr/utils.py