citation("protr") now gives better output with the BibTeX citation key.
This is improved by adding the key argument to the bibentry() call in
inst/CITATION (#49).

Revised vignette("protr") to fix typos and grammar issues. Updated images
to use knitr::include_graphics() chunks, resolving pkgdown 2.1.0
accessibility hints for missing alt text (#50).

crossSetSim()
gains two new arguments, batches and verbose. The batches argument
allows users to split the similarity computations into multiple batches,
which is useful when dealing with a large number of sequences and
limited RAM. The verbose argument enables progress updates during the
computation. This brings crossSetSim() to feature parity with
parSeqSim() (thanks, @ofleitas, #41).

A new function crossSetSimDisk() has been implemented as a disk-based
version of crossSetSim(). This function follows a similar approach to
parSeqSimDisk(), where partial results from each batch are cached on the
hard drive and merged at the end. This allows for processing larger
protein sequence sets that may not fit into RAM (#41).

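
The batch-and-merge idea described for the disk-based functions can be
illustrated with a small base R sketch. It does not call protr and is
not the package's implementation: toy_sim() and cross_sim_batched() are
hypothetical names, and the toy similarity (a Jaccard index over distinct
letters) only stands in for alignment-based similarity. Each batch of
rows is computed, cached as an .rds file, and the cached parts are merged
at the end.

```r
# Toy similarity between two sequences: Jaccard index of their distinct
# letters. An assumption for illustration, not what protr computes.
toy_sim <- function(a, b) {
  x <- unique(strsplit(a, "")[[1]])
  y <- unique(strsplit(b, "")[[1]])
  length(intersect(x, y)) / length(union(x, y))
}

# Hypothetical helper: compute a cross-set similarity matrix in batches,
# caching each batch on disk and merging the cached parts at the end.
cross_sim_batched <- function(set1, set2, batches = 2,
                              path = tempdir(), verbose = FALSE) {
  n <- length(set1)
  # Assign each row index of the result to one of `batches` groups.
  groups <- split(seq_len(n), ceiling(seq_len(n) * batches / n))
  files <- character(length(groups))

  pair_sim <- Vectorize(function(i, j) toy_sim(set1[[i]], set2[[j]]))

  for (k in seq_along(groups)) {
    rows <- groups[[k]]
    # Only this batch's rows are held in memory at a time.
    part <- outer(rows, seq_along(set2), pair_sim)
    files[k] <- tempfile(pattern = sprintf("batch_%03d_", k),
                         tmpdir = path, fileext = ".rds")
    saveRDS(part, files[k])
    if (verbose) message("Finished batch ", k, " of ", length(groups))
  }

  # Merge the cached partial results in batch order, then clean up.
  result <- do.call(rbind, lapply(files, readRDS))
  unlink(files)
  result
}

set1 <- c("MKTAYIAK", "GAVLIMFW", "STCNQY", "DEKRH")
set2 <- c("MKTAYIAK", "DEKRH", "AAAA")
m <- cross_sim_batched(set1, set2, batches = 2)
dim(m)
```

In this sketch, more batches mean smaller in-memory pieces at the cost of
more file reads and writes; the merged matrix is the same regardless of
the number of batches.
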
crossSetSim() is added for calculating pairwise similarity between two
sets of protein sequences based on sequence alignment
(thanks, @seb-mueller, #34).

Fixed extractProtFP() and extractProtFPGap() when index = NULL
(thanks, @fcampelo, #30).

Revised system.file() usage to avoid confusion (thanks, @jonalv, #31).

Added a new argument batches
to parSeqSim(). The new argument supports breaking down the pairwise
similarity computation into smaller batches. This is useful when you
have a large number of protein sequences and enough CPU cores, but not
enough RAM to compute and hold all the pairwise similarities in a single
batch. Also, use the other new argument, verbose, to track the
computation progress.

Added a new function parSeqSimDisk(). Compared to the in-memory version
parSeqSim(), this new function caches the partial results of each batch
on the hard drive and merges the results at the end. This can further
reduce the memory usage for parallel similarity computations involving a
large number of protein sequences.

Fixed a bug in parGOSim()
that created minor numerical inconsistencies in results due to argument
matching.

Updated twoGOSim() and parGOSim() to use the latest GOSemSim API for
computing GO-based semantic similarity. Issues in the code examples are
also fixed. We thank Denisa Duma for the feedback.

getUniProt().

Added two new arguments, gap.opening and gap.extension, to parSeqSim(),
allowing more flexible tuning of the sequence alignment for more types
of amino acid sequence data. We thank Dr. Maisa Pinheiro for the
feedback.

Added removeGaps()
for removing or replacing gaps (-) or any irregular characters from
protein sequences, to make them suitable for feature extraction or
sequence alignment based similarity computation. We thank Dr. Maisa
Pinheiro for the feedback.

Fixed the ifelse conditioning (nanxstats/protr@3f6e106) for the
distribution descriptor in CTD. We thank Jielu Yan from the University
of Macau for kindly reporting this issue.

curl -I -L:

2015-12-28

extractCTDD().

2014-12-18

2014-09-20

2014-06-20

LICENSE file according to CRAN policies.

readFASTA().

getUniProt().

protcheck().

protseg().