The article is an abstract of my book [1] based on previously presented publications [2], [3], [4], [5]
Algorithms
Everything about algorithms
Collective meaning recognition
The published material is in the Appendix of my book [1]
Modern civilization stands at a crossroads where it must choose the meaning of life. As technology develops, the majority of the world's population may become "superfluous", that is, not in demand in the production of values. There is another option, in which each person is a supreme value, an absolute individual, and can be indispensably useful in the technology of the collective mind.
In the eighties of the last century, the task of creating a scientific field of "collective intelligence" was set. Collective intelligence is defined as the ability of the collective to find solutions to problems more effectively than each participant individually. The right collective mind must be...
Riddles of the fast Fourier transform
• The method of phase-magnitude interpolation (PMI)
• Accurate measure of frequency, magnitude and phase of signal harmonics
• Detection of resonances
The Fast Fourier Transform (FFT) algorithm is an important tool for analyzing and processing signals of various nature.
It reconstructs the magnitude and phase spectrum of a signal in the frequency domain from magnitude samples in the time domain, and it is computationally efficient with modest memory consumption.
Although no information about the signal is lost during the conversion (the calculations are reversible up to rounding), the algorithm has some peculiarities that hinder high-precision analysis and fine further processing of the results.
The article presents an effective way to overcome such "inconvenient" features of the algorithm.
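As a rough illustration of both points, the sketch below uses a direct O(N^2) discrete Fourier transform (not an FFT, and not the article's PMI method) to show that the round trip is exact up to rounding, while a tone whose frequency falls between two bins shows up with a reduced peak magnitude. The assumption that this "picket fence" effect is among the peculiarities the article means is ours; the names `dft`, `idft` and `tone` are made up for the example.

```python
import cmath
import math


def dft(samples):
    """Direct discrete Fourier transform, O(N^2); fine for a small demo."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Inverse transform; recovers the samples up to rounding error."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def tone(cycles_per_window, n):
    """Unit-amplitude cosine with the given number of cycles in n samples."""
    return [math.cos(2 * math.pi * cycles_per_window * t / n) for t in range(n)]


N = 64
on_bin = tone(8.0, N)    # frequency sits exactly on bin 8
off_bin = tone(8.5, N)   # frequency sits halfway between bins 8 and 9

spectrum_on = dft(on_bin)
spectrum_off = dft(off_bin)

# Reversibility: the inverse transform returns the original samples.
roundtrip_error = max(
    abs(original - restored.real)
    for original, restored in zip(off_bin, idft(spectrum_off))
)

# Peak magnitude scaled to amplitude units (2/N for a real signal).
peak_on = 2 * abs(spectrum_on[8]) / N
peak_off = 2 * max(abs(v) for v in spectrum_off[: N // 2]) / N

print(f"round-trip error: {roundtrip_error:.2e}")
print(f"on-bin peak: {peak_on:.3f}, off-bin peak: {peak_off:.3f}")
```

The on-bin tone is measured at its true amplitude, while the off-bin tone reads noticeably lower even though no information was lost, which is why some form of interpolation between bins is needed for accurate measurements.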
Concordance of sense
In [1,2,3], texts (sign sequences with repetitions) were transformed (coordinatized) into algebraic systems using matrix units as word images. Coordinatization is a necessary condition for the algebraization of any subject area. Function (arrow) (7) in [1] is a matrix coordinatization of text. One can perform algebraic operations with words and fragments of matrix texts as with integers, but taking into account the noncommutativity of multiplication of words as matrices. Structurization of texts is reduced to the calculation of ideals and categories of texts in matrix form.
The Ideal Economy
I am not an economist, but in light of current events with cryptocurrencies and the economy in general, I would like to share my thoughts on some kind of ideal economy, around which everything is happening now.
High-level pipelining in TL-Verilog, RISC-V from Imagination, formal tools and open-source EDA on ChipEXPO in Moscow
This year the ChipEXPO conference in Moscow invited several Western speakers to present, in English, emerging technologies in high-level HDLs, formal verification, open-source EDA and the use of industrial RISC-V cores for education. You can join these presentations on September 14-16 for free using this link (you may need to use Google Translate from Russian to go through the registration): https://eventswallet.com/en/events/282/
The whole program is here
The English-speaking presentations and tutorials include:
Doubling effective digitization frequency by multiple pass approach, is it possible?
As described in the previous article, while reworking the DSO138 toy oscilloscope, the idea arose at some point to try doubling the maximum sampling frequency in the DSO303 firmware, to reach sweep times of 500 and 200 nanoseconds per cell. For the STM32F303, the theoretical maximum sampling rate at the ADC input is 48 MSPS (million samples per second); it is set by the minimum sampling time of the ADC sampling unit, which in our case is 1.5 clock cycles x (1/72 MHz) = 20.8 nanoseconds. However, with 4 ADCs running in parallel at 6 MHz each, only 24 MSPS can be achieved because of the limited speed of the ADC.
Let's imagine that we are looking at a strictly periodic signal that is also stationary, i.e. its frequency and amplitude do not fluctuate over time. Is it possible to digitize it in several passes rather than one, thereby increasing the effective sampling frequency?
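One way to picture the idea is the idealized Python sketch below. It is not the DSO303 firmware: it assumes a perfectly periodic 1 MHz sine, a 24 MSPS sampler, and a second pass that starts a whole number of signal periods later plus exactly half a sample period. All names (`sample`, `interleave`, `ADC_RATE_HZ`, `SIGNAL_HZ`) are hypothetical.

```python
import math

ADC_RATE_HZ = 24e6   # real sampling rate of one pass (assumed)
SIGNAL_HZ = 1e6      # frequency of the ideal periodic test signal (assumed)


def signal(t):
    """Ideal periodic input: constant frequency and amplitude."""
    return math.sin(2 * math.pi * SIGNAL_HZ * t)


def sample(source, rate_hz, count, start_s=0.0):
    """Take `count` samples of `source` at `rate_hz`, starting at `start_s`."""
    return [source(start_s + i / rate_hz) for i in range(count)]


def interleave(pass_a, pass_b):
    """Merge two passes sample by sample: a0, b0, a1, b1, ..."""
    merged = []
    for a, b in zip(pass_a, pass_b):
        merged.extend((a, b))
    return merged


# Pass 1 starts at the trigger point.
first_pass = sample(signal, ADC_RATE_HZ, 48)

# Pass 2 starts 5 full signal periods later, shifted by half a sample period.
second_start = 5 / SIGNAL_HZ + 0.5 / ADC_RATE_HZ
second_pass = sample(signal, ADC_RATE_HZ, 48, start_s=second_start)

combined = interleave(first_pass, second_pass)

# Reference: what a single pass at twice the rate would have captured.
reference = sample(signal, 2 * ADC_RATE_HZ, 96)
max_difference = max(abs(c - r) for c, r in zip(combined, reference))
print(f"max difference from a true 48 MSPS capture: {max_difference:.2e}")
```

In this ideal model the interleaved passes match a single capture at double the rate. On real hardware the result would depend on how precisely the half-sample delay and the trigger point can be controlled.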
Measuring Traffic Rate by Means of U-models
Measuring of stream rate in an artist's impression.
In one of our previous publications, we talked about a way to measure event stream rate using a counter based on exponential decay. It turns out that the idea of such a counter has an interesting generalization. This paper by Artem Shvorin and Dmitry Kamaldinov, Qrator Labs, reveals it.
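For readers unfamiliar with the starting point, a counter based on exponential decay can be sketched as follows. This is a generic textbook-style version under our own assumptions, not the U-model generalization from the paper; the class name `DecayingRateCounter` and the time constant `tau_s` are hypothetical.

```python
import math


class DecayingRateCounter:
    """Event counter whose value decays exponentially with time constant tau_s."""

    def __init__(self, tau_s):
        self.tau_s = tau_s
        self.value = 0.0
        self.last_s = 0.0

    def _decay_to(self, now_s):
        # Apply the decay accumulated since the last update.
        self.value *= math.exp(-(now_s - self.last_s) / self.tau_s)
        self.last_s = now_s

    def on_event(self, now_s):
        """Register one event at time now_s (seconds, non-decreasing)."""
        self._decay_to(now_s)
        self.value += 1.0

    def rate(self, now_s):
        """Estimated events per second at time now_s."""
        self._decay_to(now_s)
        return self.value / self.tau_s


# Usage: a steady stream of 100 events per second for 20 seconds.
counter = DecayingRateCounter(tau_s=1.0)
for i in range(2000):
    counter.on_event(i * 0.01)
estimated_rate = counter.rate(1999 * 0.01)
print(f"estimated rate: {estimated_rate:.1f} events/s")
```

The counter needs only one number and one timestamp of state, and for a steady stream its estimate settles close to the true rate.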
Data Phoenix Digest — 01.07.2021
We at Data Science Digest have always strived to ignite the fire of knowledge in the AI community. We’re proud to have helped thousands of people to learn something new and give you the tools to push ahead. And we’ve not been standing still, either.
Please meet Data Phoenix, a Data Science Digest rebranded and risen anew from our own flame. Our mission is to help everyone interested in Data Science and AI/ML to expand the frontiers of knowledge. More news, more updates, and webinars(!) are coming. Stay tuned!
The new issue of the new Data Phoenix Digest is here! AI that helps write code, EU’s ban on biometric surveillance, genetic algorithms for NLP, multivariate probabilistic regression with NGBoosting, alias-free GAN, MLOps toys, and more…
If you’re more used to getting updates every day, subscribe to our Telegram channel or follow us on social media: Twitter, Facebook.
DataScience Digest — 24.06.21
The new issue of DataScienceDigest is here!
The impact of NLP and the growing budgets to drive AI transformations. How Airbnb standardized metric computation at scale. Cross-Validation, MASA-SR, AgileGAN, EfficientNetV2, and more.
If you’re more used to getting updates every day, subscribe to our Telegram channel or follow us on social media: Twitter, LinkedIn, Facebook.
Hashing
It is an efficient searching technique. Searching is a widespread operation on any data structure. Hashing is used to search for specific records in a large domain of records. If we can efficiently find a record among many records, we can easily perform different operations on that data. Hashing stores and retrieves data from the database in O(1) time on average. We can also call it a mapping technique, because hashing maps values from a large domain of keys onto a smaller range of table indexes.
Following are the significant terminologies related to hashing:
Search Key
In the database, we usually perform searching with the help of some keys. These keys are called search keys. If we take the student data, then we search by some registration number. This registration number is a search key. We have to put the search keys in hash tables.
Hash Table
It is a data structure that gives us a way to store data in an organized manner. A hash table is shown below. It is similar to an array, and like an array it has indexes.
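To make the terms concrete, here is a minimal Python sketch of a hash table that resolves collisions by chaining. It is one common design among several and is not taken from the article; the class name `ChainedHashTable` and the sample registration numbers are invented for the example.

```python
class ChainedHashTable:
    """Hash table with separate chaining: each bucket is a list of (key, value)."""

    def __init__(self, bucket_count=8):
        self.buckets = [[] for _ in range(bucket_count)]

    def _index(self, key):
        # Map a key from a large domain onto a small range of bucket indexes.
        return hash(key) % len(self.buckets)

    def put(self, key, value):
        bucket = self.buckets[self._index(key)]
        for position, (existing_key, _) in enumerate(bucket):
            if existing_key == key:
                bucket[position] = (key, value)  # overwrite existing entry
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        for existing_key, value in self.buckets[self._index(key)]:
            if existing_key == key:
                return value
        return default


# Usage: registration numbers act as search keys for student records.
students = ChainedHashTable()
students.put(20211001, "Alice")
students.put(20211002, "Bob")
students.put(20211009, "Carol")  # may share a bucket with another key
students.put(20211002, "Robert")  # same key: value is replaced

print(students.get(20211002))
print(students.get(99999999, "not found"))
```

Lookups touch only one bucket, which is why the average cost stays constant as long as the buckets remain short.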
Binary Search
Searching is the process of finding a specific element in a group of elements. It needs a unique identity associated with the desired element. Once that identity has been found, the index of the desired element is returned. The index indicates the location or address where the element was found in the list or array. If the desired data is found, it is returned; otherwise, a null value is returned.
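Binary search, named in the heading, is one such method. A minimal sketch, assuming the input list is already sorted in ascending order and returning `None` as the "null value" when nothing is found (the function name is ours):

```python
def binary_search(sorted_items, target):
    """Return the index of target in sorted_items, or None if it is absent."""
    low, high = 0, len(sorted_items) - 1
    while low <= high:
        middle = (low + high) // 2
        if sorted_items[middle] == target:
            return middle
        if sorted_items[middle] < target:
            low = middle + 1    # target can only be in the right half
        else:
            high = middle - 1   # target can only be in the left half
    return None


numbers = [2, 5, 8, 12, 16, 23, 38, 56, 72, 91]
print(binary_search(numbers, 23))
print(binary_search(numbers, 7))
```

Each step halves the remaining range, so the search takes O(log n) comparisons.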
There are several categories of search algorithms such as.
DataScience Digest — 10.06.21
The new issue of DataScienceDigest is here!
Machine learning in healthcare, the top 10 TED talks on AI, fraud detection in Uber, DatasetGAN, Text-to-Image generation via transformers, and more…
DataScience Digest — 02.06.21
New issue of DataScienceDigest is here! OpenAI is launching a $100 million startup fund, Albumentations 1.0 has been released, lessons on ML platforms, image cropping on Twitter, and more.
Dictionary/Map
A basic data structure in computer science is the “associative array”, also known as a “map” or a “dictionary”. Dictionaries are used when information comes as key-value pairs: inputs are called keys, and outputs are called values. A dictionary is an abstract data type that stores elements so that they can be located quickly by their keys. It is like a container holding a searchable assortment of items, each stored as a key-value pair. A dictionary cannot hold multiple items under the same key; assigning to an existing key replaces its value.
A dictionary consists of multiple elements, each a key-value pair that is treated as a single unit. This is called mapping. The elements are enclosed in curly brackets. Key-value pairs are two linked values, where the key is the unique identifier by which we find our data and the value is the data itself.
A dictionary maps keys to values. It is a collection data type and does not contain duplicate keys.
It is unordered and stores data values like a map. Thus, it is similar to a real-life dictionary with distinct key values. In a dictionary, we use keys as indexes to access elements.
The dictionary helps us to organize the collection of data. It is a special data type. Its syntax is:
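The exact syntax depends on the language. As a hedged illustration, assuming Python's built-in `dict` (the curly brackets mentioned above suggest it), a dictionary literal and the basic operations might look like this; the keys and values are made up for the example.

```python
# A dictionary literal: key-value pairs inside curly brackets.
student = {"name": "Ada", "registration": 1024}

# Keys are used like indexes to read a value.
name = student["name"]

# Assigning to an existing key replaces its value (keys stay unique).
student["registration"] = 2048

# Assigning to a new key adds a new pair.
student["course"] = "Algorithms"

# get() returns None (or a chosen default) instead of failing on a missing key.
missing = student.get("email")

for key, value in student.items():
    print(key, "->", value)
```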
Binary Tree
Data structures are classified into linear and non-linear data structures. A tree is a non-linear data structure. Data is stored hierarchically in a non-linear data structure. So the tree is a way of organizing data hierarchically. A tree grows from top to bottom. In a tree, there are different kinds of nodes that are linked with each other. A tree consists of the following elements:
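As a rough illustration of hierarchical storage, here is a minimal binary tree sketch in Python, built from the usual elements: nodes, a root, children and leaves. The class and function names are ours.

```python
class Node:
    """One tree node: a value plus optional left and right children."""

    def __init__(self, value, left=None, right=None):
        self.value = value
        self.left = left
        self.right = right


def in_order(node):
    """Visit the left subtree, then the node, then the right subtree."""
    if node is None:
        return []
    return in_order(node.left) + [node.value] + in_order(node.right)


def height(node):
    """Number of levels from this node down to its deepest leaf."""
    if node is None:
        return 0
    return 1 + max(height(node.left), height(node.right))


# The root (8) is at the top; 1, 6 and 10 are leaves.
root = Node(8, left=Node(3, left=Node(1), right=Node(6)), right=Node(10))

print(in_order(root))
print(height(root))
```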
Graph
It is a collection of edges and vertices. It can be used to display any form of network.
The following graph contains a total of 4 vertices and 5 edges. In this graph, vertices are A, B, C and D while edges are AB, BD, DC, CA and AD. Vertices are also known as nodes. The line connecting these vertices is the edge. Vertices are like objects and edges indicate the relation between those vertices. Data is stored in nodes. This data can be of numerical data type or any other data structure.
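The graph described above can be written down as an adjacency list. The sketch below assumes the edges are undirected, which the text does not state explicitly; `EDGES` and `build_adjacency` are names chosen for the example.

```python
EDGES = [("A", "B"), ("B", "D"), ("D", "C"), ("C", "A"), ("A", "D")]


def build_adjacency(edges):
    """Build an undirected adjacency list: vertex -> set of neighbours."""
    adjacency = {}
    for first, second in edges:
        adjacency.setdefault(first, set()).add(second)
        adjacency.setdefault(second, set()).add(first)
    return adjacency


graph = build_adjacency(EDGES)

for vertex in sorted(graph):
    print(vertex, "->", sorted(graph[vertex]))

vertex_count = len(graph)
# Every undirected edge appears in two neighbour sets.
edge_count = sum(len(neighbours) for neighbours in graph.values()) // 2
```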
Recursion
Recursion is a strategy that algorithms use to solve specific problems. A recursive algorithm solves the main problem by using the solution of a simpler sub-problem of the same type. Recursion is a particular way of solving a problem by having a function call itself repeatedly. It applies only to functions. By using recursion, we can often reduce the size of the program or source code. The function that invokes itself is referred to as a recursive function.
Suppose we have a user-defined function named ‘recursion’ that is called from the main function. When execution reaches the call, the program looks up the function definition, executes it, and then returns control to the main function. If the same function is called inside its own definition, execution enters the definition again before the first call has finished. In other words, while the recursion function is running, it calls itself.
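A minimal sketch of this flow in Python, using the factorial as the example problem (our choice, not the article's). The base case is what stops the chain of self-calls.

```python
def factorial(n):
    """Return n! for a non-negative integer n, computed recursively."""
    if n <= 1:
        # Base case: no further self-call, control starts returning upwards.
        return 1
    # Recursive case: solve a simpler sub-problem of the same type.
    return n * factorial(n - 1)


def main():
    # main calls the recursive function; control comes back here at the end.
    print(factorial(5))


if __name__ == "__main__":
    main()
```

Without the base case the function would keep calling itself until the interpreter's recursion limit is reached.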
Overview of Morris's counters
When implementing streaming algorithms, one often needs to count events, where an event means something like a packet arrival or a connection establishment. Since the number of events is large, the available memory can become a bottleneck: an ordinary n-bit counter can register no more than 2^n - 1 events.
One way to handle a larger range of values using the same amount of memory would be approximate counting. This article provides an overview of the well-known Morris algorithm and some generalizations of it.
Another way to reduce the number of bits required for counting mass events is to use decay. We discuss such an approach here [3], and we are going to publish another blog post on this particular topic shortly.
At the beginning of this article, we analyse one straightforward probabilistic counting algorithm and highlight its shortcomings (Section 2). Then (Section 3), we describe the algorithm proposed by Robert Morris in 1978 and indicate its most essential properties and advantages. For most non-trivial formulas and statements, the text contains our proofs; the demanding reader can find them in the inserts. In the following three sections, we outline valuable extensions of the classic algorithm: you can learn what Morris's counters and exponential decay have in common, how to improve the accuracy by sacrificing the maximum value, and how to handle weighted events efficiently.
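For orientation, the classic base-2 form of the Morris counter can be sketched in a few lines of Python. This is the textbook version as we understand it, not code from the article: the stored value is a small exponent, it is incremented with probability 2^(-exponent), and the event count is estimated as 2^exponent - 1. The class name and the seed are arbitrary.

```python
import random


class MorrisCounter:
    """Approximate counter that stores only a small exponent."""

    def __init__(self, rng=None):
        self.exponent = 0
        self.rng = rng or random.Random()

    def increment(self):
        # Increase the exponent with probability 2^(-exponent).
        if self.rng.random() < 2.0 ** -self.exponent:
            self.exponent += 1

    def estimate(self):
        # Unbiased estimate of the number of increment() calls so far.
        return 2 ** self.exponent - 1


# Usage: count 100,000 events; the stored exponent stays small.
counter = MorrisCounter(random.Random(42))
for _ in range(100_000):
    counter.increment()

print("stored exponent:", counter.exponent)
print("estimated events:", counter.estimate())
```

A single counter is imprecise, since its estimate can be off by a large factor, which is the shortcoming that the accuracy-improving extensions address.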
DataScience Digest — 28.05.21
The new issue of Data Science Digest is here! Hop to learn about the latest news, articles, tutorials, research papers, and event materials on DataScience, AI, ML, and BigData. All sections are prioritized for your convenience. Enjoy!