Machine learning

The basis of artificial intelligence

Quantitative Funds: What’s Interesting for Coders?

Luxoft corporate blog, Big Data, Machine learning, IT career, Finance in IT

Hello, Habr! Not only traders but also mathematicians and programmers work with stock markets. Artem Sosulnikov, Director of Engineering at Luxoft, talks about the data that specialists at quantitative hedge funds work with, the things they pay attention to, and the working conditions at such companies.

man_of_letters 22 July at 09:03

Mode on: Comparing the two best colorization AI's

RUVDS.com corporate blog, Python, Image processing, Machine learning, TensorFlow

This article continues a series of notes about colorization. During today's experiment, we’ll be comparing a recent neural network with the good old Deoldify to gauge the rate at which the future is approaching.

This is a practical project, so we won’t pay extra attention to the underlying philosophy of the Transformer architecture. Besides, any attempt to explain how it works to a wide audience in hand-waving terms would be misleading.

A lecturer: Mr. Petrov! How does a transformer work?
Petrov with a bass voice: Hum-m-m-m.

Google Colorizing Transformer vs Deoldify

m31 1 July at 16:40

Data Phoenix Digest — 01.07.2021

Python, Algorithms, Big Data, Machine learning, Artificial Intelligence

We at Data Science Digest have always strived to ignite the fire of knowledge in the AI community. We’re proud to have helped thousands of people to learn something new and give you the tools to push ahead. And we’ve not been standing still, either.

Please meet Data Phoenix, a Data Science Digest rebranded and risen anew from our own flame. Our mission is to help everyone interested in Data Science and AI/ML to expand the frontiers of knowledge. More news, more updates, and webinars(!) are coming. Stay tuned!

The new issue of the new Data Phoenix Digest is here! AI that helps write code, EU’s ban on biometric surveillance, genetic algorithms for NLP, multivariate probabilistic regression with NGBoosting, alias-free GAN, MLOps toys, and more…

If you’re more used to getting updates every day, subscribe to our Telegram channel or follow us on social media: Twitter, Facebook.

m31 24 June at 13:09

DataScience Digest — 24.06.21

Python, Algorithms, Big Data, Machine learning, Artificial Intelligence

The new issue of DataScienceDigest is here!

The impact of NLP and the growing budgets to drive AI transformations. How Airbnb standardized metric computation at scale. Cross-Validation, MASA-SR, AgileGAN, EfficientNetV2, and more.

If you’re more used to getting updates every day, subscribe to our Telegram channel or follow us on social media: Twitter, LinkedIn, Facebook.

m31 10 June at 12:48

DataScience Digest — 10.06.21

Python, Algorithms, Big Data, Machine learning, Artificial Intelligence

The new issue of DataScienceDigest is here!

Machine learning in healthcare, the top 10 TED talks on AI, fraud detection in Uber, DatasetGAN, Text-to-Image generation via transformers, and more…

m31 2 June at 23:42

DataScience Digest — 02.06.21

Python, Algorithms, Big Data, Machine learning, Artificial Intelligence

The new issue of DataScienceDigest is here! OpenAI is launching a $100 million startup fund, Albumentations 1.0 has been released, lessons on ML platforms, image cropping on Twitter, and more.

m31 28 May at 14:29

DataScience Digest — 28.05.21

Python, Algorithms, Big Data, Machine learning, Artificial Intelligence

The new issue of Data Science Digest is here! Hop in to learn about the latest news, articles, tutorials, research papers, and event materials on Data Science, AI, ML, and Big Data. All sections are prioritized for your convenience. Enjoy!

alexpatel 26 April at 12:29

Flitter Your Business With AI Integrated Flutter App Development

Development of mobile applications, Machine learning, Software, Artificial Intelligence, Flutter

As we all know, the digital market is leaning heavily towards reliable, UX-driven processes, so app development has become quite complex, especially for mobile platforms.

For every organization, creating a product that meets its customers' needs comes with a plethora of challenges.

From a technical point of view, every business faces various challenges, including selecting the right platform for the app, choosing the right technology stack or framework, and creating an app that fulfills the needs and expectations of customers.

There are further challenges, too, that every business has to cope with while creating its dream product.

So, what to do?

Well, what if I say that the answer to all your questions is Flutter app development with Artificial Intelligence (AI) integration?

Surprised? Wondering how?

Well, AI in Flutter app development is one of the best advancements in the software market. The concept of AI was first introduced in the 20th century, along with loads of innovations and advancements that we are still integrating into mobile app development.

But what are Artificial Intelligence and Flutter app development?

m31 21 April at 12:38

Data Science Digest — 21.04.21

Python, Algorithms, Big Data, Machine learning, Artificial Intelligence

Hi All,

I’m pleased to invite you all to enroll in the Lviv Data Science Summer School to delve into advanced methods and tools of Data Science and Machine Learning, including such domains as CV, NLP, Healthcare, Social Network Analysis, and Urban Data Science. The courses are practice-oriented and geared towards undergraduates, Ph.D. students, and young professionals (intermediate level). The school runs July 19–30 and will be hosted online. Make sure to apply: spots are running out fast!

If you’re more used to getting updates every day, follow us on social media:

Telegram
Twitter
LinkedIn
Facebook

Regards,
Dmitry Spodarets.

gui_tar_gz 19 April at 15:53

Neural network Telegram bot with StyleGAN and GPT-2

Python, Machine learning, Artificial Intelligence, Social networks and communities

The Beginning

So we have already played with different neural networks. Cursed image generation using GANs, deep texts from GPT-2 — we have seen it all.

This time I wanted to create a neural entity that would act like a beauty blogger. This meant it would have to post pictures like Instagram influencers do and generate the same kind of narcissistic texts.

Initially I planned to post the neural content on Instagram, but using the Facebook Graph API, which is needed to go beyond read-only access, was too painful for me. So I reverted to Telegram, which is one of my favorite social products overall.

The name of the entity/channel (Aida Enelpi) is a bad neural-oriented pun mostly generated by the bot itself.

One of the first posts generated by Aida

m31 15 April at 22:34

Data Science Digest — We Are Back

Python, Algorithms, Big Data, Machine learning, Artificial Intelligence

Hi All,

I have some good news for you…

Data Science Digest is back! We’ve been “offline” for a while, but no worries — You’ll receive regular digest updates with top news and resources on AI/ML/DS every Wednesday, starting today.

If you’re more used to getting updates every day, follow us on social media:

Telegram - https://t.me/DataScienceDigest
Twitter - https://twitter.com/Data_Digest
LinkedIn - https://www.linkedin.com/company/data-science-digest/
Facebook - https://www.facebook.com/DataScienceDigest/

And finally, your feedback is very much appreciated. Feel free to share any ideas with me and the team, and we’ll do our best to make Data Science Digest a better place for all.

Regards,
Dmitry Spodarets.

lukyanchikov 3 April at 15:09

Distributed Artificial Intelligence with InterSystems IRIS

InterSystems corporate blog, Machine learning, Distributed systems, Artificial Intelligence

Author: Sergey Lukyanchikov, Sales Engineer at InterSystems

What is Distributed Artificial Intelligence (DAI)?

Attempts to find a “bullet-proof” definition have not produced a result: the term seems to be slightly “ahead of its time”. Still, we can analyze the term itself semantically and derive that distributed artificial intelligence is the same AI (see our effort to suggest an “applied” definition), only partitioned across several computers that are not clustered together (neither data-wise, nor via applications, nor by providing access to particular computers in principle). That is, ideally, distributed artificial intelligence should be arranged so that none of the computers participating in the “distribution” has direct access to the data or applications of another computer: the only alternative is to transmit data samples and executable scripts via “transparent” messaging. Any deviation from that ideal leads to “partially distributed artificial intelligence”, for example distributed data with a central application server, or its inverse. One way or another, the result is a set of “federated” models (i.e., models each trained on its own data source, or each trained by its own algorithm, or both at once).

Distributed AI scenarios “for the masses”

We will not be discussing edge computations, confidential data operators, scattered mobile searches, or similar scenarios that are fascinating yet not, at this moment, the most consciously and widely applied. We will be much “closer to life” if, for instance, we consider the following scenario (its detailed demo can and should be watched here): a company runs a production-level AI/ML solution whose quality of functioning is systematically checked by an external data scientist (i.e., an expert who is not an employee of the company). For a number of reasons, the company cannot grant the data scientist access to the solution, but it can send him a sample of records from a required table on a schedule or on a particular event (for example, the completion of a training session for one or several models by the solution). We also assume that the data scientist owns some version of the AI/ML mechanisms already integrated in the production-level solution that the company is running, and it is likely that the data scientist himself develops, improves, and adapts them to the concrete use cases of that company. Deployment of those mechanisms into the running solution, monitoring of their functioning, and other lifecycle aspects are handled by a data engineer (a company employee).

snakers4 30 March at 06:33

High-Quality Text-to-Speech Made Accessible, Simple and Fast

Machine learning, Sound, Natural Language Processing

There is a lot of commotion in text-to-speech now. There is a great variety of toolkits, a plethora of commercial APIs from GAFA companies (based both on new and older technologies). There are also a lot of Silicon Valley startups trying to ship products akin to "deep fakes" in speech.

But despite all this ruckus we have not yet seen open solutions that would fulfill all of these criteria:

Natural-sounding speech;
A large library of voices in many languages;
Support for 16kHz and 8kHz out of the box;
No GPUs / ML engineering team / training required;
Unique voices not infringing upon third-party licenses;
High throughput on slow hardware. Decent performance on one CPU thread;
Minimalism and lack of dependencies. One-line usage, no builds or coding in C++ required;
Positioned as a solution, not yet another toolkit / compilation of models developed by other people;
Not affiliated by any means with ecosystems of Google / Yandex / Sberbank;

We decided to share our open non-commercial solution that fits all of these criteria with the community. Since we have published the whole pipeline we do not focus much on cherry picked examples and we encourage you to visit our project GitHub repo to test our TTS for yourself.

chdan 25 February at 19:35

Chatbox on Top of SIEM Solution

Information Security, Machine learning

One of the most time-consuming steps in implementing a SIEM solution is writing and tuning the "Playbook", a set of response procedures the SOC team has to follow when an alert is triggered.

So during one of our projects we stopped for a moment and thought: "How can we optimize (ideally automate) the Playbook?"

kryma 20 January at 12:16

Doing «Data Science» even if you have never heard the words before

Python, Algorithms, Mathematics, Machine learning, Artificial Intelligence

There’s a lot of talk about machine learning nowadays. A big topic – but, for a lot of people, covered by this terrible layer of mystery. Like black magic – the chosen ones’ art, above the mere mortal for sure. One keeps hearing the words “numpy”, “pandas”, “scikit-learn” - and looking each up produces an equivalent of a three-tome work in documentation.

I’d like to shatter some of this mystery today. Let’s do some machine learning, find some patterns in our data – perhaps even make some predictions. With good old Python only – no 2-gigabyte library, and no arcane knowledge needed beforehand.
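To give a flavor of what "good old Python only" can look like, here is a minimal sketch of a least-squares line fit written without any third-party library. It is not taken from the article itself: the function names `fit_line` and `predict` and the sample data are hypothetical, chosen only for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept (pure Python)."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviation of x and y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("all x values are identical; slope is undefined")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new x value."""
    slope, intercept = model
    return slope * x + intercept


# Made-up illustrative data: hours studied vs. test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 54, 61, 63, 70]

model = fit_line(hours, scores)
print(model)              # (4.5, 46.5)
print(predict(model, 6))  # 73.5
```

The pattern found here is a straight line, and the prediction is simply that line extended to an unseen input.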

Interested? Come join us.

fralik 14 January at 16:21

CLIP from OpenAI: what is it and how you can try it out yourself

Machine learning

Neural networks (NN), and computer vision models in particular, are known to perform well on specific tasks but often fail to generalize to tasks they have not been trained on. A model that performs well on food data may perform poorly on satellite images.

A new model from OpenAI named CLIP claims to close this gap by a large margin. The paper OpenAI wrote presenting CLIP demonstrates how the model may be used on various classification datasets in a zero-shot manner.

In this article, I will explain the key ideas of the model they proposed and show you the code to use it.

snakers4 14 January at 10:09

Modern Portable Voice Activity Detector Released

Open source, Machine learning, Sound

Currently, there are hardly any high quality / modern / free / public voice activity detectors except for the WebRTC Voice Activity Detector. WebRTC, though, is starting to show its age and suffers from many false positives.

Also in some cases it is crucial to be able to anonymize large-scale spoken corpora (i.e. remove personal data). Typically personal data is considered to be private / sensitive if it contains (i) a name (ii) some private ID. Name recognition is a highly subjective matter and it depends on locale and business case, but Voice Activity and Number Detection are quite general tasks.

Key features:

Modern, portable;
Low memory footprint;
Superior metrics to WebRTC;
Trained on huge spoken corpora and noise / sound libraries;
Slower than WebRTC, but fast enough for IOT / edge / mobile applications;
Unlike WebRTC (which mostly tells silence from voice), our VAD can tell voice from noise / music / silence;
PyTorch (JIT) and ONNX checkpoints;

Typical use cases:

Spoken corpora anonymization;
Can be used together with WebRTC;
Voice activity detection for IOT / edge / mobile use cases;
Data cleaning and preparation, number and voice detection in general;
PyTorch and ONNX can be used with a wide variety of deployment options and backends in mind;

oduvan 28 December 2020 at 16:42

9 Reasons Why Students Don’t Want You as a Teacher

Programming, Machine learning, Studying in IT, Social networks and communities, Learning languages

Teaching is hard! You have to find a way to explain ideas and concepts and find an approach to each individual student, each with a unique mind and learning capabilities. You have to be patient and creative, friendly but respectful, kind but fair. You have to understand complex material and be able to present it in the simplest of ways. There are so many things that you must balance and consider in your work. Teachers, you are heroes, everyday heroes! With this heroic work comes a responsibility: keeping yourself accountable for your students’ education. Some teachers forget about that and stay oblivious to the mistakes they are making. We’ve compiled a list of 9 Reasons Why Students Don’t Want You as a Teacher. We sincerely hope that it will help you to self-reflect, better connect with your students, and achieve better results during your lessons.

snakers4 5 December 2020 at 12:55

Playing with Nvidia's New Ampere GPUs and Trying MIG

Image processing, Big Data, Machine learning, Computer hardware, Natural Language Processing

Every time the essential question arises of whether or not to upgrade the cards in the server room, I look through similar articles and watch videos like this one.

The channel with the aforementioned video is very underrated, but its author does not deal with ML. In general, when analyzing comparisons of accelerators for ML, several things usually catch your eye:

The authors usually take into account only the "adequacy" for the market of new cards in the United States;
The ratings are far from the people and are made on very standard networks (which is probably good overall) without details;
The popular mantra to train more and more gigantic models makes its own adjustments to the comparison;

The answer to the question "which card is better?" is not rocket science: cards of the 20* series didn't get much popularity, while 1080 Ti cards from Avito (the Russian Craigslist) are still very attractive (and, oddly enough, don't get cheaper, probably for this reason).

All this is fine and dandy, and the standard benchmarks are unlikely to lie too much, but I recently learned about Multi-Instance GPU (MIG) technology for A100 cards and native TF32 support on Ampere devices, and I got the idea to share my experience of actually testing cards on the Ampere architecture (3090 and A100). In this short note, I will try to answer the following questions:

Is the upgrade to Ampere worth it? (spoiler for the impatient: yes);
Is the A100 worth the money? (spoiler: in general, no);
Are there any cases when the A100 is still interesting? (spoiler: yes);
Is MIG technology useful? (spoiler: yes, but for inference and for very specific training cases);

sismetanin 5 November 2020 at 14:13

Toxic Comments Detection in Russian

Mail.ru Group corporate blog, Python, Machine learning, Social networks and communities

Currently, social network sites tend to be one of the major communication platforms in both offline and online space. Freedom of expression of various points of view, including toxic, aggressive, and abusive comments, might have a long-term negative impact on people’s opinions and social cohesion. As a consequence, the ability to automatically identify and moderate toxic content on the Internet to eliminate the negative consequences is one of the necessary tasks for modern society. This paper aims at the automatic detection of toxic comments in the Russian language. As a source of data, we utilized an anonymously published Kaggle dataset and additionally validated its annotation quality. To build a classification model, we fine-tuned two versions of Multilingual Universal Sentence Encoder, Bidirectional Encoder Representations from Transformers, and ruBERT. Fine-tuned ruBERT achieved F₁ = 92.20%, demonstrating the best classification score. We made trained models and code samples publicly available to the research community.
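For readers unfamiliar with the metric quoted above, the following is a minimal sketch of how an F₁ score is computed for a binary toxic / non-toxic classifier. The abstract does not say which averaging variant was reported, so this shows only the plain binary F₁ for the positive class; the function name `f1_score` and the labels are made up for illustration and are not the authors' code or data.

```python
def f1_score(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical labels: 1 = toxic, 0 = non-toxic.
y_true = [1, 1, 1, 0, 0, 0, 1, 0]
y_pred = [1, 1, 0, 0, 1, 0, 1, 0]

# 3 true positives, 1 false positive, 1 false negative:
# precision = 0.75, recall = 0.75, so F1 = 0.75
print(f1_score(y_true, y_pred))
```

Because F₁ balances precision against recall, it is less flattering than raw accuracy when one class (here, toxic comments) is rarer than the other.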
