Integrating data from multiple sources to identify records that correspond to the same entity is required in many real-world applications, including healthcare, national security, and business. However, privacy and confidentiality concerns impede the sharing of personal identifying values to conduct linkage across different organizations. Privacy-preserving record linkage (PPRL) techniques have been developed to tackle this problem by performing clustering based on the similarity between encoded record values, such that each cluster contains (similar) records corresponding to one single entity. When employing PPRL on databases from multiple parties, one major challenge is the prohibitively large number of similarity comparisons required for clustering, especially when the number and size of databases are large. While several private blocking methods have been proposed to reduce the number of comparisons, they fall short of providing an efficient and effective solution for linking multiple large databases. Further, all of these methods are largely dependent on the data. In this paper, we propose a novel private blocking method for efficiently linking multiple databases by exploiting the data characteristics in the form of probabilistic signatures, and we introduce a local blocking evaluation step for validating blocking methods without knowing the ground truth. Experimental results show the efficacy of our method in comparison to several state-of-the-art methods.
Contributors: Marc Schulder, Yury Bakanouski
ATC-Anno is an annotation tool for the transcription and semantic annotation of air traffic control utterances.
It was developed at the Spoken Language Systems (LSV) group at Saarland University.
The latest version of the tool can always be found on the LSV GitHub account.
If you use the tool in your research, please cite the associated paper:
Marc Schulder, Johannah O'Mahony, Yury Bakanouski, Dietrich Klakow (2020). ATC-Anno: Semantic Annotation for Air Traffic Control with Assistive Auto-Annotation. In Proceedings of the International Conference on Language Resources and Evaluation (LREC), Marseilles, France.
Contributors: Bullen, Jay C
MATLAB code used to model arsenic(III) remediation with a composite TiO2-Fe2O3 sorbent in batch and continuous-flow systems, using a modified form of the pseudo-second order (PSO) adsorption kinetic model.
This data supports the manuscript provisionally titled 'A kinetic adsorption model to inform the design of arsenic(III) treatment plants using photocatalyst-sorbent materials'.
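For orientation only, the following is a minimal Python sketch of the standard (unmodified) PSO model, dq/dt = k2 (q_e - q)^2, and its integrated form q(t) = q_e^2 k2 t / (1 + q_e k2 t). It is not the MATLAB code in this dataset and does not reproduce the modified form used in the manuscript, which is not described here. The function names and the parameter values (q_e = 10 mg/g, k2 = 0.05 g/(mg min)) are hypothetical and chosen purely for illustration.

```python
def pso_rate(q, q_e, k2):
    """Standard PSO rate law: dq/dt = k2 * (q_e - q)^2."""
    return k2 * (q_e - q) ** 2


def pso_uptake(t, q_e, k2):
    """Integrated standard PSO model with q(0) = 0:
    q(t) = q_e^2 * k2 * t / (1 + q_e * k2 * t)."""
    return (q_e ** 2 * k2 * t) / (1.0 + q_e * k2 * t)


if __name__ == "__main__":
    # Hypothetical illustration values, not taken from the dataset.
    q_e = 10.0   # equilibrium uptake, mg/g (assumed)
    k2 = 0.05    # PSO rate constant, g/(mg min) (assumed)
    for t in (0.0, 2.0, 10.0, 60.0):
        print(f"t = {t:5.1f} min, q = {pso_uptake(t, q_e, k2):.3f} mg/g")
```

In this form the uptake reaches half of q_e at t = 1 / (q_e k2), which gives a quick sanity check on fitted parameters.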
Contributors: Dominique, Alexis, Andreas Noack, Jiahao Chen, Julia TagBot
Diff since v0.3.0
Remove dependency on BinaryProvider (#24)
Merged pull requests:
Interface PROPACK for complex matrices (#22) (@amontoison)
Change Cirrus status link (#23) (@amontoison)
Release 0.3.1 (#25) (@amontoison)
Contributors: Juniper L. Simonis
Tools for interacting with the publicly available California Delta Fish Salvage Database, including continuous deployment of data access, analysis, and presentation.
Changes from the previous release: (diff)
added threading mode, which can be selected by new option mode
added brief sleeps in "while" loops, which improves performance in some circumstances
removed code for Python 2.7
Contributors: Casper da Costa-Luis, Stephen Karl Larroque, Hadrien Mary, Kyle Altendorf, Noam Yorav-Raphael, Mikhail Korobov, Ivan Ivanov, Marcel Bargull, Guangshuo CHEN, Mikhail Dektyarev, mjstevens777, Matthew D. Pagel, Martin Zugnoni, James, Charles Newey, Todd, Staffan Malmgren, Socialery, RedBug312, Orivej Desh, Max Nordlund, Jack McCracken, Hugo van Kemenade, FichteFoll, Fabian Dill, Daniel Panteleit, Alexander, Alex Rothberg, Albert Kottke, Adnan Umer
add automatic nrows and expose for manual override (#918 -> #924)
expose and warn about small chunksize in tqdm.contrib.concurrent.process_map (#912)
fix py2 file stream comparison (#727 -> #730)
add and update tests
Contributors: Cabral, Ariana Moura; Andrade, Adriano de Oliveira
This table aims to present, in a simple way, the sizing of medical oxygen consumption for healthcare delivery in Adult Intensive Care Unit beds, in accordance with the technical guidelines set out in the standard ABNT NBR 12188 and the legal provisions of RDC nº 50.