• Big Data Tools EAP 12 Is Out: Experimental Python Support and Search Function in Zeppelin Notebooks

      Update 12 of the Big Data Tools plugin for IntelliJ IDEA Ultimate, PyCharm Professional Edition, and DataGrip has been released. You can install it from the JetBrains Plugin Repository or from inside your IDE. The plugin allows you to edit Zeppelin notebooks, upload files to cloud filesystems, and monitor Hadoop and Spark clusters.


      In this release, we've added experimental Python support and global search inside Zeppelin notebooks. We’ve also addressed a variety of bugs. Let's talk about the details.


      Read more →
    • Russian microcontroller K1986BK025 based on the RISC-V processor core for smart electricity meters

      • Translation
      Welcome to the RISC-V era!

      Solutions based on the open standard instruction set architecture RISC-V are steadily increasing their presence on the market. Microcontrollers from Chinese colleagues are already in serial production, and Microchip is offering interesting solutions with an FPGA on board. The ecosystem of software and design tools for this architecture is also growing. Leaders that once seemed unshakable increasingly find themselves in for-sale listings, while young startups attract multi-million investments. Milandr has also joined this race and today began supplying interested companies with samples of its new K1986BK025 microcontroller, based on the RISC-V processor core and intended for electricity meters. Pictures, specifications, and other information, as well as a little bit of hype, are under the cut.


      Read more →
    • Big / Bug Data: Analyzing the Apache Flink Source Code

        Applications used in the field of Big Data process huge amounts of information, and this often happens in real time. Naturally, such applications must be highly reliable so that no error in the code can interfere with data processing. To achieve high reliability, one needs to keep a close eye on the code quality of projects developed for this area. The PVS-Studio static analyzer is one of the solutions to this problem. This time, Apache Flink, a project developed by the Apache Software Foundation, one of the leaders in the Big Data software market, was chosen as the test subject for the analyzer.
        Read more →
      • A Book on the API Design

          This year, each of us is looking for a special way to pass the time. I, for example, am writing a book. A book about one thing I love dearly: the API. (You can read about who I am and what expertise I have in APIs on my LinkedIn profile.)


          I've just finished the first large section, dedicated to API design. You can read it online, download the PDF or EPUB version, or take a look at the source code on GitHub.


          The book is distributed for free under a CC-BY-NC license. Enjoy!

        • Algorithms in Go: Sliding Window Pattern (Part II)

            This is the second part of the article covering the Sliding Window Pattern and its implementation in Go; the first part can be found here.


            Let's have a look at the following problem: we have an array of words, and we want to check whether a concatenation of these words is present in the given string. The length of all words is the same, and the concatenation must include all the words without any overlapping. Would it be possible to solve the problem with linear time complexity?


            Let's start with the string catdogcat and the target words cat and dog.


            How can we handle this problem?
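
One possible approach can be sketched as follows. This is a minimal illustration under stated assumptions, not necessarily the solution the full article arrives at: the function name findConcatenations is hypothetical, all words are assumed to be non-empty and of equal length, and the function returns every start index at which a concatenation begins (an empty result means no concatenation is present). For each of the possible alignments it slides a window in steps of one word, so the number of word-sized steps grows linearly with the length of the string.

```go
package main

import "fmt"

// findConcatenations returns every start index in s at which a concatenation
// of all words (each used exactly as many times as it appears in words,
// in any order, without overlapping) begins.
// Assumption: all words have the same non-zero length.
func findConcatenations(s string, words []string) []int {
	var result []int
	if len(words) == 0 || len(words[0]) == 0 {
		return result
	}
	wordLen := len(words[0])

	// need[w] is how many times w must appear inside the window.
	need := make(map[string]int, len(words))
	for _, w := range words {
		need[w]++
	}

	// Try every alignment of word boundaries: 0, 1, ..., wordLen-1.
	for offset := 0; offset < wordLen; offset++ {
		seen := make(map[string]int) // word counts inside the current window
		left, count := offset, 0     // window start and number of words in it

		for right := offset; right+wordLen <= len(s); right += wordLen {
			word := s[right : right+wordLen]

			if need[word] == 0 {
				// Not a target word: no valid window can contain it,
				// so restart the window right after it.
				seen = make(map[string]int)
				count = 0
				left = right + wordLen
				continue
			}

			seen[word]++
			count++

			// Too many copies of this word: shrink the window from the left.
			for seen[word] > need[word] {
				seen[s[left:left+wordLen]]--
				count--
				left += wordLen
			}

			// The window holds exactly the required words.
			if count == len(words) {
				result = append(result, left)
			}
		}
	}
	return result
}

func main() {
	fmt.Println(findConcatenations("catdogcat", []string{"cat", "dog"})) // prints [0 3]
}
```

For catdogcat the sketch reports two matches: catdog starting at index 0 and dogcat starting at index 3. Indices are grouped by alignment, so they are not guaranteed to be sorted for other inputs.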

            Read more →
          • The Rules for Data Processing Pipeline Builders


              "Come, let us make bricks, and burn them thoroughly."
              – legendary builders

              You may have noticed by 2020 that data is eating the world. And whenever any reasonable amount of data needs processing, a complicated multi-stage data processing pipeline will be involved.


              At Bumble — the parent company operating Badoo and Bumble apps — we apply hundreds of data transforming steps while processing our data sources: a high volume of user-generated events, production databases and external systems. This all adds up to quite a complex system! And just as with any other engineering system, unless carefully maintained, pipelines tend to turn into a house of cards — failing daily, requiring manual data fixes and constant monitoring.


              For this reason, I want to share certain good engineering practices with you, ones that make it possible to build scalable data processing pipelines from composable steps. While some engineers understand such rules intuitively, I had to learn them by doing, making mistakes, fixing, sweating and fixing things again…


              So behold! I bring you my favourite Rules for Data Processing Pipeline Builders.

              Read more →
            • Development of “YaRyadom” (“I’mNear”) application under the control of Vk Mini Apps. Part 1 .Net Core

              • Translation
              The application is being developed to help people find peers who share similar interests and spend time doing what they like. The project is currently in beta testing on the social network “VKontakte”. Right now I am fixing bugs and adding everything that is missing. I felt like I could use a bit of a distraction and decided to write a little about the development. While writing, I decided to split the text into several parts. Here we will focus on the backend nuances I ran into and on everything the user does not see.
              Read more →
            • cGit-UI — a web interface for Git Repositories

              • Tutorial

              cGit-UI is a web interface for Git repositories. It is based on a CGI script written in C.


              This article covers installing and configuring cGit-UI to work using Nginx + uWsgi. Setting up server components is quite simple and practically does not differ from setting up cGit.


              cGit-UI supports Markdown files, which are processed on the server side using the md4c library, which has proven itself in the KDE Plasma project. cGit-UI lets you add site verification codes and scripts from systems such as Google Analytics and Yandex.Metrika for traffic analysis. Users who want to receive donations for their projects can create and import custom donation modal dialogs.


              Instead of looking at screenshots, it is better to look at the working site to decide on installing cGit-UI on your own server.

              Read more →
            • Configuring FT4232H using the ftdi_eeprom

              • Tutorial


              The FT4232H is a USB 2.0 High Speed to UART converter IC. It has four UART ports and one USB port.


              By connecting EEPROM memory to this chip, you can set specific operating modes or change the manufacturer's data.


              Let's walk through an example and configure the FT4232H directly on a system running GNU/Linux. We will do this using the ftdi_eeprom utility.

              Read more →
            • Playing with Nvidia's New Ampere GPUs and Trying MIG


                Every time the essential question arises of whether or not to upgrade the cards in the server room, I look through similar articles and watch such videos.


                The channel behind the aforementioned video is very underrated, but its author does not deal with ML. In general, when analyzing comparisons of accelerators for ML, several things usually catch your eye:


                • The authors usually take into account only the "adequacy" of new cards for the United States market;
                • The ratings are far removed from everyday practice and are made on very standard networks (which is probably good overall) without details;
                • The popular mantra of training ever more gigantic models makes its own adjustments to the comparison.

                The answer to the question "which card is better?" is not rocket science: cards of the 20* series didn't gain much popularity, while 1080 Ti cards from Avito (the Russian Craigslist) are still very attractive (and, oddly enough, don't get cheaper, probably for this reason).


                All this is fine and dandy, and the standard benchmarks are unlikely to lie too much. But recently I learned about Multi-Instance GPU (MIG) technology for A100 cards and native TF32 support on Ampere devices, and I got the idea to share my experience of actually testing cards on the Ampere architecture (the 3090 and the A100). In this short note, I will try to answer the following questions:


                • Is the upgrade to Ampere worth it? (spoiler for the impatient: yes);
                • Is the A100 worth the money? (spoiler: in general, no);
                • Are there any cases where the A100 is still interesting? (spoiler: yes);
                • Is MIG technology useful? (spoiler: yes, but for inference and for very specific training cases).
                Read more →
              • Russian AI Cup 2020 — a new strategy game for developers



                  This year, many processes have been transformed, and traditions and habits have changed. The rhythm of life is different, and there's more uncertainty and strain. But an IT person's soul craves variety, and many developers have asked us whether the annual Russian AI Cup will be held this year. Is there going to be an announcement? What is the main theme of the upcoming championship? Should I take a vacation?

                  Though some changes are expected, it will be held in keeping with the best traditions. In the run-up, we are announcing one of today's largest online AI programming championships: Russian AI Cup. We invite you to make history!
                  Read more →
                • Vital Characteristics Of The Best Webflow Designers

                    A great website is a key to success. Enhancing the online presence of your brand is a must nowadays, as technological advancement has changed the global business landscape. Hence, it is important to have a Webflow developer who will take charge of the code used in website templates and designs. Webflow is a website-building tool: a flexible platform geared toward creating a consistent business site. Using this tool can pave the way for your brand to excel on the web.
                    Read more →
                  • The Code Analyzer is wrong. Long live the Analyzer

                      Foo(std::move(buffer), line_buffer - buffer.get());

                      Combining many actions in a single C++ expression is bad practice: such code is hard to understand and maintain, and it is easy to make mistakes in it. For example, one can introduce a bug by combining different actions in the evaluation of function arguments. We agree with the classic recommendation that code should be simple and clear. Now let's look at an interesting case where the PVS-Studio analyzer is technically wrong, but from a practical point of view, the code should still be changed.
                      Read more →
                    • How static code analysis helps in the GameDev industry

                        The gaming industry is constantly evolving and is developing faster than a speeding bullet. Along with the growth of the industry, the complexity of development also increases: the code base is getting larger and the number of bugs is growing as well. Therefore, modern game projects need to pay special attention to code quality. Today we will cover one of the ways to improve your code, static analysis, and show how PVS-Studio helps in practice with game projects of various sizes.
                        Read more →
                      • Analyzing the Code Quality of Microsoft's Open XML SDK

                          My first encounter with the Open XML SDK took place when I was looking for a library that I could use to create some accounting documents in Word. After more than 7 years of working with the Word API, I wanted to try something new and easier to use. That's how I learned that Microsoft offered an alternative solution. As tradition has it, before our team adopts any program or library, we check it with the PVS-Studio analyzer.
                          Read more →
                        • How to build a high-performance application on Tarantool from scratch

                          • Tutorial

                          I came to Mail.ru Group in 2013, and I needed a queue for one task. First of all, I decided to check what the company already had. They told me they had this Tarantool product, so I checked how it worked and decided that adding a queue broker to it could work perfectly well.

                          I contacted Kostja Osipov, the senior expert in Tarantool, and the next day he gave me a 250-line script that was capable of managing almost everything I needed. Since that moment, I have been in love with Tarantool. It turned out that a small amount of code written in a fairly simple scripting language could deliver a whole new level of performance for this DBMS.

                          Today, I’m going to tell you how to instantiate your own queue in Tarantool 2.2.
                          Read more →
                        • Modern Web-UI for SVN repositories

                          • Tutorial

                          cSvn is a web interface for Subversion repositories. It is based on a CGI script written in C.


                          This article covers installing and configuring cSvn to work using Nginx + uWsgi. Setting up server components is quite simple and practically does not differ from setting up cGit.


                          cSvn supports Markdown files, which are processed on the server side using the md4c library, which has proven itself in the KDE Plasma project. cSvn lets you add site verification codes and scripts from systems such as Google Analytics and Yandex.Metrika for traffic analysis. Users who want to receive donations for their projects can create and import custom donation modal dialogs.


                          Instead of looking at screenshots, it is better to look at the working site to decide on installing cSvn on your own server.


                          It should be noted that you can browse not only your own repositories, but also configure viewing of third-party resources via HTTPS and SVN protocols.

                          Read more →