Integrating data from multiple sources to identify records that correspond to the same entity is required in many real-world applications, including healthcare, national security, and business. However, privacy and confidentiality concerns impede the sharing of personally identifying values needed to conduct linkage across organizations. Privacy-preserving record linkage (PPRL) techniques have been developed to tackle this problem by clustering records based on the similarity between their encoded values, such that each cluster contains similar records corresponding to a single entity. When employing PPRL on databases from multiple parties, one major challenge is the prohibitively large number of similarity comparisons required for clustering, especially when the number and size of the databases are large. While several private blocking methods have been proposed to reduce the number of comparisons, they fall short of providing an efficient and effective solution for linking multiple large databases. Further, all of these methods are largely dependent on the data. In this paper, we propose a novel private blocking method for efficiently linking multiple databases by exploiting data characteristics in the form of probabilistic signatures, and we introduce a local blocking evaluation step for validating blocking methods without knowing the ground truth. Experimental results show the efficacy of our method in comparison to several state-of-the-art methods.
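To illustrate why blocking reduces the number of comparisons, here is a minimal Python sketch of generic key-based blocking. It is not the probabilistic-signature method proposed in the paper: the field names (`surname`, `birth_year`), the blocking key, and the sample records are all hypothetical, and the unsalted hash only stands in for a real privacy-preserving encoding.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def blocking_key(record):
    # Hypothetical key: first letter of the surname plus the birth year.
    raw = f"{record['surname'][:1].lower()}|{record['birth_year']}"
    # An unsalted hash is NOT a secure encoding; it only stands in for one here.
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()[:8]


def build_blocks(records):
    # Group record ids by their encoded blocking key.
    blocks = defaultdict(list)
    for rec in records:
        blocks[blocking_key(rec)].append(rec["id"])
    return blocks


def candidate_pairs(blocks):
    # Only records that share a block are compared with each other.
    pairs = set()
    for ids in blocks.values():
        pairs.update(combinations(sorted(ids), 2))
    return pairs


# Hypothetical records pooled from three parties (a, b, c).
records = [
    {"id": "a1", "surname": "smith", "birth_year": 1980},
    {"id": "a2", "surname": "smyth", "birth_year": 1980},
    {"id": "b1", "surname": "Smith", "birth_year": 1980},
    {"id": "b2", "surname": "jones", "birth_year": 1975},
    {"id": "c1", "surname": "Jones", "birth_year": 1975},
    {"id": "c2", "surname": "brown", "birth_year": 1990},
]

pairs = candidate_pairs(build_blocks(records))
all_pairs = len(records) * (len(records) - 1) // 2
print(f"{len(pairs)} candidate pairs instead of {all_pairs}")
```

With these six records, blocking leaves 4 candidate pairs out of 15 possible ones. The saving grows quickly as the number and size of the databases increase.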
Contributors: Marc Schulder, Yury Bakanouski
ATC-Anno is an annotation tool for the transcription and semantic annotation of air traffic control utterances.
It was developed at the Spoken Language Systems (LSV) group at Saarland University.
The latest version of the tool can always be found on the LSV GitHub account.
If you use the tool in your research, please cite the associated paper:
Marc Schulder, Johannah O'Mahony, Yury Bakanouski, Dietrich Klakow (2020). ATC-Anno: Semantic Annotation for Air Traffic Control with Assistive Auto-Annotation. In Proceedings of the International Conference on Language Resources and Evaluation (LREC), Marseilles, France.
Contributors: Bullen, Jay C
MATLAB code used to model arsenic(III) remediation with a composite TiO2-Fe2O3 sorbent in batch and continuous-flow systems, using a modified form of the pseudo-second-order (PSO) adsorption kinetic model.
This data supports the manuscript provisionally titled 'A kinetic adsorption model to inform the design of arsenic(III) treatment plants using photocatalyst-sorbent materials'.
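For orientation, the standard (unmodified) PSO model is dq/dt = k2 (q_e - q)^2, which with q(0) = 0 integrates to q(t) = k2 q_e^2 t / (1 + k2 q_e t). The Python sketch below evaluates that textbook form only. It does not reproduce the modified model from the manuscript or the MATLAB code in this record, and the parameter values are hypothetical.

```python
def pso_rate(q, q_e, k2):
    # Standard PSO rate law: dq/dt = k2 * (q_e - q)^2.
    return k2 * (q_e - q) ** 2


def pso_uptake(t, q_e, k2):
    # Closed-form solution of the standard PSO model with q(0) = 0.
    return (k2 * q_e ** 2 * t) / (1.0 + k2 * q_e * t)


def integrate_pso(t_end, q_e, k2, steps=10_000):
    # Forward-Euler integration, used only as a numerical cross-check.
    dt = t_end / steps
    q = 0.0
    for _ in range(steps):
        q += dt * pso_rate(q, q_e, k2)
    return q


# Hypothetical parameters: q_e in mg/g, k2 in g/(mg*min), time in minutes.
q_e, k2 = 10.0, 0.01
for t in (10.0, 60.0):
    print(t, pso_uptake(t, q_e, k2), integrate_pso(t, q_e, k2))
```

In this form the sorbent reaches half of its equilibrium loading q_e at t = 1 / (k2 q_e), which is 10 minutes for the values assumed here.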
This new release (cWB-6.3.1) is a minor upgrade of the first public version of cWB (cWB-6.3.0): it fixes minor problems in the previous version and introduces some new functionality in the cwb_gwosc command. It remains fully compatible, in terms of results, with the version (wat-6.2.6) used for the analysis of the LIGO and Virgo data collected during the second observing run (O2).
See https://gwburst.gitlab.io/ for more details.
Public git repository: https://gitlab.com/gwburst/public/library
Contributors: Juniper L. Simonis
Tools for interacting with the publicly available California Delta Fish Salvage Database, including continuous deployment of data access, analysis, and presentation.
Contributors: denniscfeng, Teddy Tran, Sera Yang, Théo Bodrito, Stefan van der Walt
Initial release; does not work well on some butterflies.
NIfTI-Studio is a MATLAB toolbox that lets researchers visualize 3D NIfTI and Analyze images. Users can flip through slices, change orientations, render surfaces in 3D, plot mosaics of slices, and add statistical overlays. NIfTI-Studio also supports the manual creation of regions of interest (ROIs) for additional analyses. The toolbox features an intuitive user interface that is easy to learn, and users can also perform nearly all tasks via command-line calls. It was designed for use with MRI brain imaging data, but it should be compatible with any NIfTI- or Analyze-format images.