Tag: hashing

Hashing method slashes cost of implementing differential privacy — ScienceDaily

Rice University computer scientists have discovered an inexpensive way for tech companies to implement a rigorous form of personal data privacy when using or sharing large databases for machine learning.
“There are many cases where machine learning could benefit society if data privacy could be ensured,” said Anshumali Shrivastava, an associate professor of computer science at Rice. “There’s huge potential for improving medical treatments or finding patterns of discrimination, for example, if we could train machine learning systems to search for patterns in large databases of medical or…

Text Similarity using K-Shingling, Minhashing and LSH (Locality Sensitive Hashing)

Text similarity plays an important role in Natural Language Processing (NLP) and there are several areas where this has been utilized extensively. Some of the applications include information retrieval, text categorization, topic detection, machine translation, text summarization, document clustering, plagiarism detection, news recommendation, etc., encompassing almost all domains. But…
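
As a rough illustration of the first two techniques named in the title, the sketch below builds character k-shingles and a MinHash signature using only the standard library. The function names, the shingle length of 5, the 128 hash functions and the use of salted BLAKE2b as the hash family are all assumptions made for this example, not taken from the linked article, and the LSH banding step is not shown.

```python
import hashlib


def shingles(text: str, k: int = 5) -> set:
    """Return the set of overlapping k-character shingles of the text."""
    text = " ".join(text.lower().split())  # normalise case and whitespace
    if len(text) <= k:
        return {text}
    return {text[i:i + k] for i in range(len(text) - k + 1)}


def jaccard(a: set, b: set) -> float:
    """Exact Jaccard similarity: |A & B| / |A | B|."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def _seeded_hash(seed: int, shingle: str) -> int:
    # One member of the (assumed) hash family: BLAKE2b salted with the seed.
    digest = hashlib.blake2b(
        shingle.encode("utf-8"),
        digest_size=8,
        salt=seed.to_bytes(8, "little"),
    ).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(items: set, num_hashes: int = 128) -> list:
    """For each hash function, keep the minimum hash value over all shingles."""
    return [min(_seeded_hash(seed, s) for s in items) for seed in range(num_hashes)]


def estimate_similarity(sig_a: list, sig_b: list) -> float:
    """The fraction of matching signature positions estimates the Jaccard similarity."""
    matches = sum(x == y for x, y in zip(sig_a, sig_b))
    return matches / len(sig_a)


doc_a = "Text similarity plays an important role in natural language processing"
doc_b = "Text similarity plays a major role in natural language processing"

set_a, set_b = shingles(doc_a), shingles(doc_b)
print("exact Jaccard:   ", round(jaccard(set_a, set_b), 3))
print("MinHash estimate:", estimate_similarity(minhash_signature(set_a), minhash_signature(set_b)))
```

The signature is a fixed-size summary of each document, so two long documents can be compared by looking at 128 integers instead of their full shingle sets. The estimate gets closer to the exact Jaccard value as the number of hash functions grows.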

Learning to hash

How to design data representation techniques with applications to fast retrieval tasks

Hashing is one of the most fundamental operations in data management. It allows fast retrieval of data items using a small amount of memory. Hashing is also a fundamental algorithmic operation with rigorously understood theoretical properties. Virtually all advanced programming languages provide libraries for adding and retrieving…

Anonymise Sensitive Data in a Pandas DataFrame Column with hashlib

Stop sharing personally identifiable information in your DataFrames

A common scenario encountered by Data Scientists is sharing data with others. But what should you do if that data contains personally identifiable information (PII) such as email addresses, customer IDs or phone numbers?

A simple solution is to remove these fields before sharing the data. However, your analysis may rely on having the PII data. For example, customer IDs in an e-commerce transactional dataset are necessary to know which customer bought which product.

Instead, you can anonymise the PII fields in your data using hashing.

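Hashing is a one-way process that transforms a string of plaintext characters into a string of fixed length. The same input always produces the same output, and with a cryptographic hash function two different inputs are extremely unlikely to produce the same output.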
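
A minimal sketch of the idea with `hashlib`, assuming SHA-256 as the hash function and a small list of dictionaries standing in for a DataFrame. The `hash_value` helper, the `salt` parameter and the sample rows are hypothetical names and data invented for this example.

```python
import hashlib


def hash_value(value: str, salt: str = "") -> str:
    """Return the SHA-256 hex digest (64 characters) of salt + value."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()


# Made-up transactional rows; customer_id is the PII field to anonymise.
rows = [
    {"customer_id": "C001", "product": "kettle"},
    {"customer_id": "C002", "product": "toaster"},
    {"customer_id": "C001", "product": "mug"},
]

# Replace each customer_id with its hash; every other field is left unchanged.
anonymised = [
    {**row, "customer_id": hash_value(row["customer_id"], salt="example-salt")}
    for row in rows
]

for row in anonymised:
    print(row["customer_id"][:12], row["product"])
```

Because equal inputs give equal digests, the first and third rows still share a customer ID after hashing, so per-customer analysis keeps working. On a pandas DataFrame the same helper could be applied with something like `df["customer_id"].astype(str).apply(hash_value)`. Short or guessable identifiers can be recovered by hashing every candidate value, which is why the sketch includes a salt that would need to be kept secret.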


12. Crypto-craze, A Flavor of PrimeNet

In case you’ve missed it, there have been a tremendous number of news stories, social media posts and the like on Bitcoin, hashing algorithms, blockchain, video graphics cards and crypto-mining. If you are anything like most of us, that information barely gives you enough footing to discuss the topic. But what does it all mean? What is a blockchain? What are hashing algorithms? How does one mine for bitcoins or any other crypto-currencies? Is it as profitable as most say? These and many other questions will be addressed in this blog.

PrimeNet – For the past few years, I’ve been intrigued by the application of prime numbers in public key encryption algorithms. As a result, I decided to join a community of mathematicians in search of the largest prime number.


Hashing real stuff

Hashing real pictures (not QR codes) is a very tricky and interesting problem, similar to the one behind Google’s ‘similar images’ search. The algorithm is called “locality-sensitive hashing”.


Perceptual hashing is the use of an algorithm that produces a snippet or fingerprint of various forms of multimedia.[1][2] Perceptual hashes are similar to one another when the features of the media are similar, whereas cryptographic hashing relies on the avalanche effect, in which a small change in the input value creates a drastic change in the output value.
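
The contrast can be sketched with a toy example. The code below uses an "average hash" (one bit per pixel, set when the pixel is at least as bright as the image mean) as a stand-in for a perceptual hash, and SHA-256 as the cryptographic hash. The average hash, the 4x4 grayscale grid and all function names are assumptions chosen for illustration; real perceptual hashes usually downscale the image first and are more elaborate.

```python
import hashlib


def average_hash(pixels: list) -> list:
    """Toy perceptual hash: one bit per pixel, 1 if the pixel is >= the mean brightness."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p >= mean else 0 for p in flat]


def sha256_bits(pixels: list) -> list:
    """Cryptographic hash of the raw pixel bytes, expanded to a list of 256 bits."""
    data = bytes(p for row in pixels for p in row)
    digest = hashlib.sha256(data).digest()
    return [(byte >> i) & 1 for byte in digest for i in range(8)]


def hamming(a: list, b: list) -> int:
    """Number of positions at which two equal-length bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


# A made-up 4x4 grayscale image: dark on the left, bright on the right.
image = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [10, 20, 200, 210],
    [15, 25, 205, 215],
]
# The same picture, slightly brightened.
brighter = [[p + 5 for p in row] for row in image]

perceptual_distance = hamming(average_hash(image), average_hash(brighter))
crypto_distance = hamming(sha256_bits(image), sha256_bits(brighter))

print("average-hash bits that differ:", perceptual_distance, "of 16")
print("SHA-256 bits that differ:     ", crypto_distance, "of 256")
```

The brightened copy keeps the same dark-left, bright-right layout, so its average hash is unchanged, while the avalanche effect means roughly half of the SHA-256 bits are expected to flip.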


What is a DAO?

A decentralized autonomous organization (DAO), sometimes labeled a decentralized autonomous corporation (DAC), is an organization that is run through rules encoded as computer programs called “smart contracts”, and whose financial transaction record and program rules are maintained on a blockchain database.

Its goal is to encode the rules and decision-making apparatus of an organisation, eliminating the need for documents and people in its management, and creating a structure with decentralized control.

DAOs are basically abstractions used to collaborate, and mechanisms that align economic incentives on the internet through software. DAOs allow humans to collaborate on a large scale toward a common objective and reconcile incentives among individuals who don’t know each other, without the need for a trusted third party.…