API Reference
This section documents the API of the querynator modules and functions.
Command line interface querynator.
- querynator.__main__.query_api_civic(*args: Any, **kwargs: Any) → Any
- querynator.__main__.query_api_cgi(*args: Any, **kwargs: Any) → Any
- querynator.__main__.create_report(*args: Any, **kwargs: Any) → Any
- querynator.__main__.Cancer()[source]
Function to create instance of click.Choice EnumType with cancer types
source: https://www.cancergenomeinterpreter.org/js/cancertypes.js
- Returns:
Enumeration of cancer types
- Return type:
click.Choice EnumType
- class querynator.__main__.EnumType(enum, case_sensitive=False)[source]
This is a class for a click.Choice of type EnumType
- convert(value, param, ctx)[source]
Convert the value to the correct type. This is not called if the value is None (the missing value).
This must accept string values from the command line, as well as values that are already the correct type. It may also convert other compatible types.
The param and ctx arguments may be None in certain situations, such as when converting prompt input.
If the value cannot be converted, call fail() with a descriptive message.
- Parameters:
value – The value to convert.
param – The parameter that is using this type to convert its value. May be None.
ctx – The current context that arrived at this value. May be None.
- querynator.__main__.filter_vcf_by_vep(vcf_path, logger)[source]
Function to filter given vcf to remove synonymous and low impact variants based on VEP annotation
- Parameters:
vcf_path (str) – Variant Call Format (VCF) file (Version 4.2)
- Returns:
list of lists of pyVCF3 records (input file, removed, filtered)
- Return type:
list
- querynator.__main__.get_unique_querynator_dir(querynator_output)[source]
Adds an index if “querynator_results” already exists in the user-given output directory
- Parameters:
querynator_output (str) – path to store querynator results
- Returns:
unique result directory
- Return type:
str
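The indexing behaviour described for get_unique_querynator_dir can be illustrated with a minimal sketch. The helper name `unique_dir` and the `_<n>` suffix scheme are assumptions made for illustration; the naming that querynator actually uses may differ.

```python
import os
import tempfile


def unique_dir(base_path: str) -> str:
    """Return base_path if unused, else base_path with the first free numeric suffix.

    Hypothetical helper: the "_<n>" suffix format is an assumption.
    """
    if not os.path.exists(base_path):
        return base_path
    index = 1
    while os.path.exists(f"{base_path}_{index}"):
        index += 1
    return f"{base_path}_{index}"


# Usage: the second call sees the existing directory and appends an index.
with tempfile.TemporaryDirectory() as tmp:
    first = unique_dir(os.path.join(tmp, "querynator_results"))
    os.makedirs(first)
    second = unique_dir(os.path.join(tmp, "querynator_results"))
    first_name = os.path.basename(first)
    second_name = os.path.basename(second)
```

The sketch only computes a free path; creating the directory is left to the caller.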
- querynator.__main__.make_enum(values)[source]
Function to create an EnumType from a dict
- Parameters:
values (dict) – json/dict like object with {key: value} pairs
- Returns:
enumeration
- Return type:
Enum
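As a rough illustration of turning {key: value} pairs into an enumeration, the sketch below uses the functional API of Python's `enum.Enum`. The helper name `enum_from_dict`, the class name "Choices" and the sample data are assumptions; this is not necessarily how make_enum is implemented.

```python
from enum import Enum


def enum_from_dict(values: dict) -> type:
    """Build an Enum class whose members are the dict's {key: value} pairs.

    Hypothetical helper; the class name "Choices" is an arbitrary assumption.
    """
    return Enum("Choices", values)


# Illustrative data only, not taken from cancertypes.js.
Choices = enum_from_dict({"ANY": "Any cancer type", "EXAMPLE": "Example type"})
member_names = [member.name for member in Choices]
```

Members can then be looked up by name, for example `Choices["ANY"].value`.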
- querynator.__main__.write_vcf(vcf_template, vcf_record_list, out_name)[source]
Function to write a vcf file from list of pyvcf3 records to result directory
- Parameters:
vcf_template (pysam header object) – pysam header object from input vcf
vcf_record_list (list) – list of pysam records
out_name (str) – name for the created vcf file
- Returns:
None
- Return type:
None
Query the cancergenomeinterpreter (CGI) via its Web API
- querynator.query_api.cgi_api.add_cgi_metadata(url, output, original_input, genome, filter_vep)[source]
Attach metadata to cgi query
- Parameters:
url (str) – API url with job_id
output (str) – sample name
filter_vep (bool) – flag whether VEP based filtering should be performed
- Returns:
None
- Raises:
BadZipfile
- querynator.query_api.cgi_api.delete_job_cgi(url, headers, output, logger)[source]
Delete query from the CGI server after analysis is complete
- Parameters:
url (str) – API url with job_id
headers (dict) – Valid headers for API query
output (str) – sample name
- Raises:
Exception
- querynator.query_api.cgi_api.download_cgi(url, headers, output, logger)[source]
Download query results from cgi
- Parameters:
url (str) – API url with job_id
headers (dict) – Valid headers for API query
output (str) – sample name
- Raises:
Exception
- querynator.query_api.cgi_api.hg_assembly(genome)[source]
Use correct assembly name
- Parameters:
genome (str) – Genome build version, defaults to hg38
- Returns:
genome
- Return type:
str
- querynator.query_api.cgi_api.query_cgi(mutations, cnas, translocations, genome, cancer, headers, logger, output, original_input, filter_vep)[source]
Actual query to cgi
- Parameters:
mutations (str) – Variant file (vcf,tsv,gtf,hgvs)
cnas (str) – File with copy number alterations
translocations (str) – File with translocations
genome (str) – Genome build version
email (str) – To query cgi a user account is needed
cancer (str) – Cancer type from cancertypes.js
logger – prints info to console
output (str) – sample name
- querynator.query_api.cgi_api.status_done(url, headers, logger)[source]
Check query status
- Parameters:
url (str) – API url with job_id
headers (dict) – Valid headers for API query
- Raises:
HTTPError
- Returns:
True if query performed successfully
- Return type:
bool
- querynator.query_api.cgi_api.submit_query_cgi(mutations, cnas, translocations, genome, cancer, headers, logger)[source]
Function that submits the query to the REST API of CGI
- Parameters:
mutations (str) – Variant file (vcf,tsv,gtf,hgvs)
cnas (str) – File with copy number alterations
translocations (str) – File with translocations
genome (str) – CGI takes hg19 or hg38
email (str) – To query cgi a user account is needed
cancer (str) – Cancer type from cancertypes.js
token (str) – user token for CGI
logger – prints info to console
- Returns:
API url with job_id
- Return type:
str
Query the Clinical Interpretations of Variants In Cancer (CIViC) API via its python tool CIViCPY
- querynator.query_api.civic_api.access_civic_by_coordinate(coord_dict, logger, build)[source]
Query CIViC API for individual variants
- Parameters:
coord_list (list) – List of CoordinateQuery objects
build (str) – reference genome
- Returns:
CIViC variant objects of successfully queried variants
- Return type:
list
- querynator.query_api.civic_api.add_civic_metadata(out_path, input_file, search_mode, genome, filter_vep)[source]
Attach metadata to civic query
- Parameters:
out_path (str) – Name of directory in which results are stored
input_file (str) – path of original input file
search_mode (str) – search mode used in CIViC Query
filter_vep (bool) – flag whether VEP based filtering should be performed
- Returns:
None
- Return type:
None
- querynator.query_api.civic_api.append_to_dict(dict1, dict2)[source]
Appends values of a dictionary to another dictionary with lists as values
- Parameters:
dict1 (dict) – dictionary to append to
dict2 (dict) – dictionary with values to append
- Returns:
appended dict
- Return type:
dict
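A minimal sketch of appending the values of one dictionary to the lists held by another follows. The helper name `append_values` is hypothetical, and creating missing keys on the fly is an assumption rather than documented behaviour of append_to_dict.

```python
def append_values(target: dict, source: dict) -> dict:
    """Append each value in source to the list stored under the same key in target.

    Hypothetical helper: creating missing keys on the fly is an assumption.
    """
    for key, value in source.items():
        target.setdefault(key, []).append(value)
    return target


# Illustrative data only.
collected = {"gene": ["BRAF"]}
append_values(collected, {"gene": "KRAS", "variant": "G12D"})
```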
- querynator.query_api.civic_api.check_vcf_input(vcf_path, logger)[source]
Checks whether input is vcf-file with all necessary columns.
- Parameters:
vcf_path (str) – Variant Call Format (VCF) file (Version >= 4.0)
- Returns:
None
- querynator.query_api.civic_api.concat_dicts(coord_id_dict, variant_obj, filter_vep)[source]
Create and combine different dictionaries created for single CIViC variant object
- Parameters:
coord_obj (CIViC CoordinateQuery Object) – CoordinateQuery Object to respective variant object
variant_obj – single CIViC variant object
- Returns:
All information for respective CIViC variant object
- Return type:
dict
- querynator.query_api.civic_api.create_civic_results(variant_list, out_path, logger, filter_vep)[source]
Combine result dictionaries of all CIViC variant objects to a table and write it to user-specified file
- Parameters:
variant_list (list) – List of CIViC variant objects of successfully queried variants
out_path (str) – Name for directory in which result-table will be stored
filter_vep (bool) – flag whether VEP based filtering should be performed
- Returns:
None
- Return type:
None
- querynator.query_api.civic_api.get_assertion_information_from_variant(variant_obj)[source]
Get all assertion information from a single CIViC variant object
- Parameters:
variant_obj – single CIViC variant object
- Returns:
Assertion information for respective CIViC variant object
- Return type:
dict
- querynator.query_api.civic_api.get_coordinates_from_vcf(input, build, logger)[source]
Reads in a vcf file using “pyVCF3” and creates CoordinateQuery objects for each variant. This function finds the following (ref-alt) variant types: SNPs (A-T), DelIns (AA-TT) and Deletions (TTTCA-AT)
- Parameters:
input (list or str) – list of pyVCF3 records or vcf file to query
build (str) – reference genome
- Returns:
CoordinateQuery objects
- Return type:
list
- querynator.query_api.civic_api.get_evidence_information_from_variant(variant_obj)[source]
Get all evidence from a single CIViC variant object
- Parameters:
variant_obj – single CIViC variant object
- Returns:
Evidence information for respective CIViC variant object
- Return type:
dict
- querynator.query_api.civic_api.get_gene_information_from_variant(variant_obj)[source]
Get all gene information from a single CIViC variant object
- Parameters:
variant_obj – single CIViC variant object
- Returns:
Gene information for respective CIViC variant object
- Return type:
dict
- querynator.query_api.civic_api.get_molecular_profile_information_from_variant(variant_obj)[source]
Get all molecular profile information from a single CIViC variant object
- Parameters:
variant_obj – single CIViC variant object
- Returns:
Molecular profile information for respective CIViC variant object
- Return type:
dict
- querynator.query_api.civic_api.get_positional_information_from_coord_obj(coord_obj)[source]
Get information about the position of the variant in the genome
- Parameters:
coord_obj (CIViC CoordinateQuery Object) – CoordinateQuery Object to respective variant object
- Returns:
Positional information for respective CIViC variant object
- Return type:
dict
- querynator.query_api.civic_api.get_querynator_id(querynator_id)[source]
Get the querynator id in dict format
- Parameters:
querynator_id (str) – Querynator id
- Returns:
Querynator id for respective CIViC variant object
- Return type:
dict
- querynator.query_api.civic_api.get_variant_information_from_variant(variant_obj)[source]
Get all variant information from a single CIViC variant object
- Parameters:
variant_obj – single CIViC variant object
- Returns:
Variant information for respective CIViC variant object
- Return type:
dict
- querynator.query_api.civic_api.query_civic(vcf, out_path, logger, input_file, genome, filter_vep)[source]
Command to query the CIViC API
- Parameters:
vcf (str or list) – Variant Call Format (VCF) file (Version 4.2) or list of pyVCF3 variant records
out_path (str) – Name for directory in which result-table will be stored
input_file (str) – path of original input file
filter_vep (bool) – flag whether VEP based filtering should be performed
- Returns:
None
- Return type:
None
- querynator.query_api.civic_api.smoothen_dict(dict, s)[source]
makes string out of lists
- Parameters:
dict (dict) – dict with lists as values
s (bool) – True if string and special string character needed
- Returns:
dict with strings as values
- Return type:
dict
- querynator.query_api.civic_api.sort_coord_list(coord_dict)[source]
Sort the input list to the bulk search
- Parameters:
coord_list (list) – List of CoordinateQuery objects
- Returns:
sorted coordinates
- Return type:
list
- querynator.query_api.civic_api.sort_rules(s)[source]
Set rules to correctly sort chromosomes X,Y,M
- Parameters:
s (str) – “string” chromosome (X,Y,M)
- Returns:
integer to sort by
- Return type:
int
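The idea of mapping chromosome labels to integers so that X, Y and M sort after the numbered chromosomes can be sketched as below. The helper name `chromosome_sort_key` and the concrete values 23, 24 and 25 are assumptions; sort_rules may use different integers.

```python
def chromosome_sort_key(chrom: str) -> int:
    """Map a chromosome label to an integer so X, Y and M sort after 1-22.

    Hypothetical helper: the values 23, 24 and 25 are assumptions.
    """
    special = {"X": 23, "Y": 24, "M": 25}
    return special[chrom] if chrom in special else int(chrom)


ordered = sorted(["X", "10", "M", "2", "Y", "1"], key=chromosome_sort_key)
```

Using an integer key avoids the lexicographic ordering in which "10" would sort before "2".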
- querynator.query_api.civic_api.vcf_file(vcf_path)[source]
Checks whether input is vcf-file.
- Parameters:
vcf_path (str) – Variant Call Format (VCF) file (Version 4.2)
- Returns:
None
Create one report of the querynator results and individual reports for each variant
- querynator.report_scripts.create_report.add_variant_name_report(df)[source]
Adds a column with a name of the variant for the report to the df
- Parameters:
df (pandas DataFrame) – result df
- Returns:
List of variant names
- Return type:
list
- querynator.report_scripts.create_report.assign_comb_evidence_labels(row)[source]
Assign the evidence labels for each Knowledgebase
- Parameters:
row (pandas DataFrame row) – Row of the variant dataframe
- Returns:
Evidence labels for each Knowledgebase
- Return type:
str
- querynator.report_scripts.create_report.check_if_nan(value)[source]
Checks if a value is NaN and returns an empty string if it is.
- Parameters:
value (str) – The value to check.
- Returns:
The value if it is not NaN, otherwise an empty string.
- Return type:
str
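The NaN check described for check_if_nan could look roughly like the following standard-library sketch. The helper name `blank_if_nan` is hypothetical, and treating only float NaN as missing is an assumption.

```python
import math


def blank_if_nan(value):
    """Return "" for a float NaN, otherwise return the value unchanged.

    Hypothetical helper: only float NaN is treated as missing here.
    """
    if isinstance(value, float) and math.isnan(value):
        return ""
    return value


from_nan = blank_if_nan(float("nan"))
from_text = blank_if_nan("BRAF")
```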
- querynator.report_scripts.create_report.create_barplot(input, title, out_path)[source]
Creates and saves a barplot as png
- Parameters:
input (pandas.DataFrame) – input dataframe
title (str) – title of the plot
out_path (str) – output path
- Returns:
matplotlib figure
- Return type:
matplotlib figure
- querynator.report_scripts.create_report.create_evidence_table(row, width_dict)[source]
Creates a table containing CIViC evidence information for a specific variant.
- Parameters:
row (pandas.Series) – The row of the dataframe.
width_dict (list) – A list containing the width of each column.
- querynator.report_scripts.create_report.create_html_link(s)[source]
Creates a clickable link from a string. Used to convert file path into clickable form.
- Parameters:
s (str) – The string to create a link from.
- Returns:
The string as a clickable link.
- Return type:
str
- querynator.report_scripts.create_report.create_link_col(row, report_path)[source]
Creates a link to the individual report to display in the overall report
- Parameters:
row (pandas Series) – row of the df
report_path (str) – path to the directory in which the individual reports will be saved in
- Returns:
link to the individual report
- Return type:
str
- querynator.report_scripts.create_report.create_report_htmls(outdir, basename, civic_path, logger)[source]
Creates the overall report for all variants and the individual reports for each variant
- Parameters:
outdir (str) – Path to report directory
basename (str) – User given Project name
civic_path – Path to civic results
- Returns:
None
- Return type:
None
- querynator.report_scripts.create_report.create_therapy_table(row, response, width_dict, biomarkers_df)[source]
Creates an HTML table containing all Therapy & Drug related information provided by CGI for a specific Protein Change.
- Parameters:
row (pandas.Series) – The row of the dataframe.
response (str) – The response to a specific drug.
width_dict (list) – A list containing the width of each column.
biomarkers_df (pandas.DataFrame) – The biomarkers dataframe from CGI.
- Returns:
An HTML table containing all Therapy & Drug related information provided by CGI for a specific Protein Change.
- Return type:
HTML table
- querynator.report_scripts.create_report.create_tier_table(df, tier, report_path)[source]
Creates a table for the report for the given tier
- Parameters:
df (pandas DataFrame) – df containing all variants
tier (int) – tier to create the table for
report_path (str) – path to the directory in which the individual reports will be saved in
- Returns:
df containing the variants of the given tier
- Return type:
pandas DataFrame
- querynator.report_scripts.create_report.create_upsetplots(df, out_path)[source]
Create upsetplot of (1) the number of variants per Knowledgebase and (2) the number of variants per tier
- Parameters:
df (pandas.DataFrame) – Variant dataframe
out_path (str) – Path to output directory
- Returns:
List of Upsetplot figures
- Return type:
list
- querynator.report_scripts.create_report.encode_upsetplot(fig)[source]
Encodes an upsetplot figure as base64
- Parameters:
fig (matplotlib figure) – Upsetplot figure
- Returns:
Encoded upsetplot figure as string to add to report
- Return type:
str
- querynator.report_scripts.create_report.get_KB_count(df)[source]
Get the count of variants for each Knowledgebase to add to the pieplot
- Parameters:
df (pandas DataFrame) – Variant dataframe
- Returns:
Count of variants for each Knowledgebase
- Return type:
pandas DataFrame
- querynator.report_scripts.create_report.get_disease_names_CIViC(row)[source]
Get CIViC’s disease names for the variant
- Parameters:
row (pandas DataFrame row) – Row of the variant dataframe
- Returns:
Disease names for the variant
- Return type:
str
- querynator.report_scripts.create_report.get_evidence_description(row)[source]
Takes a string of evidence descriptions and returns them as an HTML list
- Parameters:
row (pandas.core.series.Series) – row of the dataframe
- Returns:
string of evidence descriptions
- Return type:
str
- querynator.report_scripts.create_report.get_reference_build(metadata_path)[source]
Gets the reference build from the metadata file.
- Parameters:
metadata_path (str) – The path to the metadata file.
- Returns:
The reference build.
- Return type:
str
- querynator.report_scripts.create_report.get_sources(row)[source]
Get the Knowledgebases containing information about the variant
- Parameters:
row (pandas DataFrame row) – Row of the variant dataframe
- querynator.report_scripts.create_report.get_therapy_information_CGI(row, biomarkers_df, response, width_dict)[source]
Gets all associated disease names of a specific variant.
- Parameters:
row (pandas.Series) – The row of the dataframe.
biomarkers_df (pandas.DataFrame) – The biomarkers dataframe from CGI.
response (str) – The response to a specific drug.
width_dict (list) – A list containing the width of each column.
- Returns:
A pandas DataFrame containing all Therapy & Drug related information provided by CGI for a specific Protein Change.
- Return type:
pandas.DataFrame
- querynator.report_scripts.create_report.get_therapy_names(row, civic_only)[source]
Get the therapy names for the variant
- Parameters:
row (pandas DataFrame row) – Row of the variant dataframe
- querynator.report_scripts.create_report.remove_dups(row)[source]
Removes duplicates from string
- Parameters:
row (pd Series) – row of pd DataFrame
- Returns:
row without duplicates
- Return type:
pd Series
- querynator.report_scripts.create_report.retrieve_info_from_row(row, biomarkers_df, metadata_path)[source]
This function retrieves the information from a row of the merged dataframe and returns a dictionary with the information for the report of a specific variant.
- Parameters:
row (pandas.core.series.Series) – row of the dataframe
biomarkers_df (pandas.core.frame.DataFrame) – dataframe with all biomarkers linked to a specific variant
metadata_path (str) – The path to the metadata file.
- Returns:
dictionary with the information for the report of a specific variant
- Return type:
dict
- querynator.report_scripts.create_report.save_plot(input, title, out_path)[source]
Creates and saves an upsetplot figure as png
- Parameters:
input (pandas.DataFrame) – input dataframe
title (str) – title of the plot
out_path (str) – output path
- Returns:
matplotlib figure
- Return type:
matplotlib figure
- querynator.report_scripts.create_report.split_cols(col, col_name)[source]
splits specific string cols differently
- Parameters:
col – The column of the dataframe.
col_name (str) – the name of the column
- Returns:
Split column
- Return type:
pandas.Series
- querynator.report_scripts.create_report.write_individual_report(row, template_html, report_path, biomarkers_df, metadata_path)[source]
This function creates a report for a specific variant.
- Parameters:
row (pandas.core.series.Series) – row of the dataframe
template_html (str) – path to the template html file
report_path (str) – path to the individual report directory
biomarkers_df (pandas.core.frame.DataFrame) – dataframe with the biomarkers linked to the therapies
metadata_path (str) – path to the metadata file
- Returns:
None
- Return type:
None
- querynator.report_scripts.create_report.write_overall_report(template_html, report_html, fig_kb, fig_tiers, tier_table_list)[source]
Create the overall report for the querynator results
- Parameters:
template_html (str) – Path to the template html file
report_html (str) – Path to the report html file
fig_kb (str) – Path to upset plot of the kb distribution
fig_tiers (str) – Path to upset plot of the tier distribution
tier_table_list (list) – List of pretty-html-tables of the tier tables
- Returns:
None
- Return type:
None
Combine the results of the CGI query with the initial VEP annotation
- querynator.report_scripts.combine_cgi.combine_cgi(cgi_path, outdir, logger)[source]
Command to combine the cgi results with the vcf’s VEP annotation
- Parameters:
cgi_path (str) – Path to a CGI result folder generated using the querynator
outdir (str) – Path to report directory
- Returns:
None
- Return type:
None
- querynator.report_scripts.combine_cgi.extract_coords(row)[source]
extracts coordinates from the hgvs notation provided in “alterations.tsv”
- Parameters:
row (pandas Series) – row of a pandas DataFrame
- Returns:
extracted coordinates
- Return type:
pandas Series
- querynator.report_scripts.combine_cgi.get_all_alterations(row)[source]
extract only the alteration strings from the “Alterations” col in biomarkers.tsv
- Parameters:
row (pandas Series) – row of a pandas DataFrame
- Returns:
link of biomarker to all related alterations
- Return type:
list
- querynator.report_scripts.combine_cgi.get_highest_evidence(row, biomarkers_linked)[source]
get highest associated CGI evidence of the current alteration (A-D) from the biomarkers dataframe
- Parameters:
row (pandas Series) – row of a pandas DataFrame
biomarkers_linked (pandas DataFrame) – pd DataFrame of the project’s “biomarkers.tsv”
- Returns:
highest associated evidence
- Return type:
str
- querynator.report_scripts.combine_cgi.link_biomarkers(biomarkers_df)[source]
add alteration-link column to “biomarkers.tsv”
- Parameters:
biomarkers_df (pandas DataFrame) – pd DataFrame of the project’s “biomarkers.tsv”
- Returns:
DataFrame of biomarkers with additional alteration-link col
- Return type:
pandas DataFrame
- querynator.report_scripts.combine_cgi.merge_alterations_vep(vep_df, alterations_df)[source]
merge vep and CGI alterations annotations for each variant based on positional information (chr, pos, ref, alt)
- Parameters:
vep_df (pandas DataFrame) – DataFrame of variants and their VEP annotation
alterations_df (pandas DataFrame) – DataFrame of variants and their CGI alterations annotations
- Returns:
merged DataFrame of variants and their VEP & CGI alterations annotations
- Return type:
pandas DataFrame
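The positional merge on (chr, pos, ref, alt) can be illustrated without pandas by joining two lists of dictionaries on that key. This is only a sketch: the helper name `merge_on_position`, the column names, the inner-join semantics and the sample rows are all assumptions, and merge_alterations_vep itself operates on pandas DataFrames.

```python
def merge_on_position(vep_rows: list, alteration_rows: list) -> list:
    """Inner-join two lists of dicts on the (chr, pos, ref, alt) key.

    Hypothetical helper: column names and inner-join semantics are assumptions.
    """

    def position_key(row: dict) -> tuple:
        return (row["chr"], row["pos"], row["ref"], row["alt"])

    # Index one side by position so each lookup is a dictionary access.
    by_position = {position_key(row): row for row in alteration_rows}
    return [
        {**row, **by_position[position_key(row)]}
        for row in vep_rows
        if position_key(row) in by_position
    ]


# Illustrative rows only.
vep_rows = [
    {"chr": "7", "pos": 140453136, "ref": "A", "alt": "T", "Consequence": "missense_variant"},
    {"chr": "1", "pos": 100, "ref": "G", "alt": "C", "Consequence": "synonymous_variant"},
]
alteration_rows = [
    {"chr": "7", "pos": 140453136, "ref": "A", "alt": "T", "Gene": "BRAF"},
]
merged = merge_on_position(vep_rows, alteration_rows)
```

Rows without a positional match in the other table are dropped in this sketch; a different join type would keep them.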
- querynator.report_scripts.combine_cgi.read_filtered_vcf(filtered_vcf)[source]
Create a table containing the VEP annotation of each variant and positional information to connect to the alterations.tsv
- Parameters:
filtered_vcf (str) – Path to the project’s VEP filtered vcf
- Returns:
vep table
- Return type:
pandas DataFrame
- querynator.report_scripts.combine_cgi.read_modify_alterations(alterations_path)[source]
reads in and adds positional information to alterations file
- Parameters:
alterations_path (str) – Path to alterations file
- Returns:
None
- Return type:
None
- querynator.report_scripts.combine_cgi.remove_prefix(s, prefix)[source]
removes prefix of a string
- Parameters:
s (str) – string which prefix should be removed
prefix (str) – prefix to remove from string
- Returns:
string with prefix removed
- Return type:
str
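Prefix removal as described for remove_prefix can be sketched in a few lines. The helper name `strip_prefix` is hypothetical, and leaving the string unchanged when the prefix is absent is an assumption.

```python
def strip_prefix(s: str, prefix: str) -> str:
    """Return s without prefix if s starts with it, otherwise s unchanged.

    Hypothetical helper: the no-match behaviour is an assumption.
    """
    return s[len(prefix):] if s.startswith(prefix) else s


stripped = strip_prefix("chr7", "chr")
untouched = strip_prefix("7", "chr")
```

On Python 3.9 and later, the built-in `str.removeprefix` provides the same behaviour.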
- querynator.report_scripts.combine_cgi.subset_alterations(df)[source]
subset alterations file to only include relevant columns
- Parameters:
df (pandas DataFrame) – alterations DataFrame
- Returns:
subsetted alterations DataFrame
- Return type:
pandas DataFrame
Combine the results of the CIViC query with the initial VEP annotation
- querynator.report_scripts.combine_civic.combine_civic(civic_path, outdir, logger)[source]
Command to combine the civic results with the vcf’s VEP annotation
- Parameters:
civic_path (str) – Path to a CIViC result folder generated using the querynator
outdir (str) – Path to report directory
- Returns:
None
- Return type:
None
- querynator.report_scripts.combine_civic.merge_civic_vep(vep_df, civic_df)[source]
merge vep and civic annotation for each variant based on the Querynator ID
- Parameters:
vep_df (pandas DataFrame) – DataFrame of variants and their VEP annotation
civic_df (pandas DataFrame) – DataFrame of variants and their CIViC annotation
- Returns:
merged DataFrame of variants and their VEP & CIViC annotation
- Return type:
pandas DataFrame
- querynator.report_scripts.combine_civic.read_civic_results(civic_results)[source]
Read in the project’s CIViC annotation created by the querynator
- Parameters:
civic_results (str) – Path to the project’s CIViC annotation
- Returns:
DataFrame of CIViC results
- Return type:
pandas DataFrame
- querynator.report_scripts.combine_civic.read_filtered_vcf(filtered_vcf)[source]
Create a table containing the VEP annotation of each variant
- Parameters:
filtered_vcf (str) – Path to the project’s VEP filtered vcf
- Returns:
vep table
- Return type:
pandas DataFrame
Combine CIViC-VEP with CGI-VEP
- querynator.report_scripts.combine_cgi_civic.combine_cgi_civic(outdir, logger)[source]
Combine cgi-vep table with the civic-vep table
- Parameters:
outdir (str) – Path to report directory
- Returns:
None
- Return type:
None
- querynator.report_scripts.combine_cgi_civic.merge_civic_cgi(alterations_vep, civic_vep)[source]
merge CIViC and CGI alterations annotations for each variant based on the similar variant VEP annotation
- Parameters:
alterations_vep (pandas DataFrame) – DataFrame of variants and their VEP and CGI alterations annotations
civic_vep (pandas DataFrame) – DataFrame of variants and their VEP and CIViC alterations annotations
- Returns:
merged DataFrame of variants and their VEP & CIViC & CGI alterations annotations
- Return type:
pandas DataFrame
Sort the variants into tiers (1-4) and provide a score for each variant to sort them within the tiers. The procedure is based on the Standards and Guidelines provided by the AMP (Association for Molecular Pathology) and described by Li et al. (Li MM et al. Standards and Guidelines for the Interpretation and Reporting of Sequence Variants in Cancer: A Joint Consensus Recommendation of the Association for Molecular Pathology, American Society of Clinical Oncology, and College of American Pathologists. J Mol Diagn. 2017 Jan;19(1):4-23. doi: 10.1016/j.jmoldx.2016.10.002. PMID: 27993330; PMCID: PMC5707196.)
- querynator.report_scripts.sort_variants.add_tiers_and_scores_to_df(outdir, logger)[source]
Assigns variants from the combined CGI-CIViC-VEP table to tiers and gives them a score to rank them within these tiers
- Parameters:
outdir (str) – Path to report directory
- Returns:
None
- Return type:
None
- querynator.report_scripts.sort_variants.check_nan_in_pair(pair)[source]
check if one of the values is nan
- Parameters:
pair – list of values that are either string or nan
- Returns:
True, index of the nan value and index of the non-nan value
- Return type:
list
- querynator.report_scripts.sort_variants.extract_num(s)[source]
extracts the number from a string with pattern (e.g.) “benign(0.001)”
- Parameters:
s (str) – string to extract number from
- Returns:
extracted number
- Return type:
float
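Extracting the number from a string such as “benign(0.001)” can be sketched with a regular expression. The helper name `number_in_parentheses`, the exact pattern and the error raised when no number is found are assumptions, not the behaviour of extract_num.

```python
import re


def number_in_parentheses(s: str) -> float:
    """Return the number enclosed in parentheses, e.g. 0.001 for "benign(0.001)".

    Hypothetical helper: raising ValueError on a missing number is an assumption.
    """
    match = re.search(r"\(([-+]?\d*\.?\d+(?:[eE][-+]?\d+)?)\)", s)
    if match is None:
        raise ValueError(f"no parenthesised number in {s!r}")
    return float(match.group(1))


score = number_in_parentheses("benign(0.001)")
```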
- querynator.report_scripts.sort_variants.generate_allele_freq_score(af, gnomad)[source]
generates the variantMTB score for the allele frequency of a specific variant
- Parameters:
af (str) – a variant’s associated allele frequencies
gnomad (str) – a variant’s associated gnomAD frequencies
- Returns:
allele frequency score
- Return type:
int
- querynator.report_scripts.sort_variants.generate_consequence_score(cgi_consequence, civic_consequence)[source]
generates the variantMTB score for the consequence of a specific variant
- Parameters:
cgi_consequence (str) – a variant’s associated consequence provided by CGI
civic_consequence (str) – a variant’s consequence as given by CIViC
- Returns:
consequence score
- Return type:
int
- querynator.report_scripts.sort_variants.generate_evidence_score(evidence_col)[source]
generates the variantMTB score for the evidence level of one of the Knowledgebases of a specific variant
- Parameters:
evidence_col – evidence value of one of the Knowledgebases
- Returns:
evidence score
- Return type:
int
- querynator.report_scripts.sort_variants.generate_pathogenicity_score_score(sift, polyphen)[source]
generates the variantMTB score for the pathogenicity scores (SIFT & PolyPhen2) of a specific variant
- Parameters:
sift (str) – a variant’s associated SIFT scores
polyphen (str) – a variant’s associated PolyPhen2 scores
- Returns:
pathogenicity score
- Return type:
int
- querynator.report_scripts.sort_variants.get_allele_freq_tiering(row)[source]
checks if AF is < 0.01 to assign to tier 3 (true) or 4 (false)
- Parameters:
row (pandas Series) – row of a pandas DataFrame
- Returns:
True if AF < 0.01, False if AF > 0.01
- Return type:
bool
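The threshold rule above (AF below 0.01 leads to tier 3, otherwise tier 4) can be written as a small sketch. The helper name `tier_from_allele_frequency` is hypothetical, and treating an AF of exactly 0.01 as tier 4 is an assumption, since the description leaves that case open.

```python
def tier_from_allele_frequency(af: float, threshold: float = 0.01) -> int:
    """Return tier 3 for an allele frequency below the threshold, else tier 4.

    Hypothetical helper: an AF equal to the threshold is assigned tier 4 here,
    which is an assumption.
    """
    return 3 if af < threshold else 4


rare_tier = tier_from_allele_frequency(0.001)
common_tier = tier_from_allele_frequency(0.2)
```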
- querynator.report_scripts.sort_variants.get_cgi_consequence_score(cgi_consequence)[source]
Scores the CGI consequence
- Parameters:
cgi_consequence (str) – a variant’s associated consequence provided by CGI
- Returns:
CGI consequence score
- Return type:
int
- querynator.report_scripts.sort_variants.get_civic_consequence_score(civic_consequence)[source]
Translates the CIViC consequence (variant type) (https://civic.readthedocs.io/en/latest/model/variants/types.html) nomenclature in the CGI/VEP nomenclature when possible and scores the variant
- Parameters:
civic_consequence (str) – a variant’s consequence as given by CIViC
- Returns:
CIViC consequence score
- Return type:
int
- querynator.report_scripts.sort_variants.get_consequence_score(consequence_str)[source]
Returns the highest score of the associated CIViC variant types for a variant
- Parameters:
consequence_str (str) – a variant’s consequence
- Returns:
consequence score
- Return type:
int
- querynator.report_scripts.sort_variants.get_largest_af(af_string)[source]
extracts the largest allele frequency from a string of allele frequencies. If single str is given, gives it out as int
- Parameters:
af_string (str) – comma-separated string of allele frequencies (af)
- Returns:
largest af in string
- Return type:
int
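Picking the largest value from a comma-separated string of allele frequencies can be sketched as follows. The helper name `largest_value` is hypothetical, and returning a float (rather than the int return type listed above) is an assumption made so that fractional frequencies are preserved.

```python
def largest_value(values: str) -> float:
    """Return the largest number in a comma-separated string of numbers.

    Hypothetical helper: returns a float and skips empty entries, both of
    which are assumptions.
    """
    parts = [float(part) for part in values.split(",") if part.strip()]
    return max(parts)


largest = largest_value("0.001,0.02,0.0005")
single = largest_value("0.5")
```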
- querynator.report_scripts.sort_variants.get_largest_path_score(ps_string, ps_score)[source]
extracts the largest pathogenicity score from a string of pathogenicity scores. If single str is given, gives it out as int
- Parameters:
ps_string (str) – comma-separated string of pathogenicity scores
ps_score (str) – respective pathogenicity score (SIFT or PolyPhen2)
- Returns:
largest pathogenicity score in string
- Return type:
int
- querynator.report_scripts.sort_variants.get_min(e)[source]
returns minimum of an input
- Parameters:
e (str) – string or np.nan
- Returns:
min of string or np.nan
- Return type:
str