How do people usually write a server if they don't really care about performance? A program starts, then starts accepting incoming connections from clients and starts a new thread for each client, which is engaged in servicing this client. If you use framework, like Spring or Flask or Poco there, then it does something like this inside itself - the only difference is the threads can be reused, that is, taken from a certain pool. It's all quite convenient, but not too effective (and Spring is bad). Most likely, your threads serving clients do not live long and most of the time they are waiting either to receive data from the client or to send it to the client - that is, they are waiting for some system calls to return. Creating an OS thread is quite an expensive operation, as is context switching between OS threads. If you want to be able to serve a lot of customers efficiently, you need to come up with something else. For example, callbacks, but they are pretty inconvenient (though there are different opinions on this).
Another option is to use non-blocking I/O in combination with some kind of implementation of user-space threads (fibers). In this article I will show you how to write all this with your own hands.