
Questions tagged [stream-processing]

Stream processing is consuming an input stream or sequence of bytes with some format, processing the data as it arrives, and translating it into another format in a related output stream.
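As a rough illustration of the definition above, here is a minimal sketch that consumes an input stream one record at a time and writes each record to an output stream in another format. The `name,value` input format, the JSON-lines output, and the function name `csv_lines_to_json_lines` are assumptions chosen for the example, not part of the tag description.

```python
import io
import json


def csv_lines_to_json_lines(source, sink):
    """Read 'name,value' records line by line and write each one as a JSON line.

    Only one record is held in memory at a time, so the input can be
    arbitrarily long (a file, a socket wrapper, sys.stdin, ...).
    """
    count = 0
    for line in source:
        line = line.strip()
        if not line:
            continue  # skip blank lines instead of failing on them
        name, value = line.split(",", 1)
        sink.write(json.dumps({"name": name, "value": int(value)}) + "\n")
        count += 1
    return count


# In-memory streams stand in for real input and output here.
source = io.StringIO("a,1\nb,2\n\nc,3\n")
sink = io.StringIO()
processed = csv_lines_to_json_lines(source, sink)
```

Because the function only relies on iteration and `write`, the same code works with any file-like objects.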

2 votes
2 answers
169 views

I'm working on a problem right now that processes incoming data at a very high rate. Each event that flows in has an association ID, and each group of associated events will affect behaviour over time ...
Kris's user avatar
  • 141
1 vote
3 answers
199 views

I am tasked to design a system that should receive data either as files or via an API and perform ETL functions. The end result is put into an RDBMS. For the sake of example, imagine a system that ...
Sharon Ben Asher's user avatar
3 votes
4 answers
1k views

I watched two YouTube videos: the first covers the concepts of batch processing and stream ingestion, and the second compares batch and real-time processing. Is it ...
bridgemnc's user avatar
  • 261
1 vote
2 answers
231 views

I noticed that our codebase contains multiple versions of the same method, which unmarshalls the inbound byte stream into Java objects, and that the only difference between the current and previous ...
Banana's user avatar
  • 141
-1 votes
1 answer
122 views

Outline of the current architecture of our web app and the issue I'm seeing: the client-side app is React, talking to a server running the Play! framework via an API. On the page is a table that ...
bluedevil2k's user avatar
0 votes
0 answers
87 views

I have consumer pools that consume only events matching a specific attribute (e.g. one pool takes origin=WINDOWS, the other origin=LINUX). Since ordering is important (ordering is based on instanceId, so this is the ...
2 votes
4 answers
980 views

If streams in programming are about reading data in chunks, how then is it possible to process the data in scenarios where the data cannot be processed bit by bit, but the processing logic needs ...
Finlay Weber's user avatar
3 votes
1 answer
994 views

Say that you have a monitoring application that reports downtimes for a large number of remote systems (think IOT). A monitoring daemon polls the remote system and reports that status (on/off) via an ...
bangnab's user avatar
  • 756
1 vote
2 answers
309 views

I want to create a stream class. The input stream should read/parse a continuous range from left to right, providing convenience methods. The implementation isn't a problem, but choosing a consistent ...
Viatorus's user avatar
  • 119
1 vote
1 answer
111 views

I want to show a number of posts, ordered by their score, limited to some number. The score is the product of the number of likes they have (from another table) and the reciprocal of their normalized (...
Isvara's user avatar
  • 630
0 votes
0 answers
81 views

In a case where we have multiple sources of data, we need to: persist each source of data in its raw form; process the data sources, eventually with each other, to transform them into a presentable state (...
Omegaspard's user avatar
-1 votes
1 answer
98 views

I have a collection of documents, which hold a subject id, a timestamp and a value. For example: { sid: 1, t: 3, v: "A" } { sid: 1, t: 5, v: "B" } Which means subject#1 is ...
uylmz's user avatar
  • 1,139
-3 votes
1 answer
437 views

I was watching YouTube, squinted at the text in the video, and wondered what happens when I switch from 720p to 1080p. Does the streaming service, YouTube or any other, just have the files in ...
heretoinfinity's user avatar
0 votes
2 answers
85 views

I am drafting the architecture for a 1:1 video-calling mobile app with face filters (face recognition). I could use P2P, e.g. WebRTC, because it means more savings and less complexity. However, continuous face recognition during ...
softcode's user avatar
  • 103
0 votes
1 answer
69 views

I have a system which generates pieces of data that need to be tracked and assigned a specific TTYL value (a value in seconds) for how long they are supposed to live before being re-processed. I ...
user391986's user avatar
0 votes
1 answer
455 views

Suppose there is a system (like an ERP) that writes to a database (not too big, less than 100GB). You need to export the data from this database to a data warehouse (like RedShift or BigQuery) ...
julianomontini's user avatar
0 votes
1 answer
193 views

I have a Buffer which wraps a stream of binary data. The first byte indicates order, either big endian or little endian, which is needed to unpack the data. class FooBuffer implements Buffer is ...
D. Shamray's user avatar
-1 votes
1 answer
66 views

I am facing an issue that I don't believe is novel, but I am nonetheless having trouble finding a solution that fits well with our system. We have a constant stream of events going into AWS Kinesis. ...
Osman's user avatar
  • 107
3 votes
3 answers
3k views

I'd like to match a regex pattern on a stream, but I am not sure what algorithm to use. I certainly don't want to load the entire file into memory. I tried to figure out how to do this, but I have ...
inf3rno's user avatar
  • 1,259
0 votes
1 answer
113 views

Problem: A streaming application should perform matching transitively, i.e. if A == B & B == C then A == C. Current implementation: The application accepts domain objects in a streaming fashion and ...
Ammar's user avatar
  • 123
-2 votes
1 answer
3k views

I need to generate a large Excel file (something around 50 megs) and send the response to another API, which will provide it to the front end as a download option. My question is if it will be better to ...
JackTheKnife's user avatar
2 votes
1 answer
254 views

Are streams of binary data considered a form of bit banging? Does this definition change if the array is buffered? I am referring to software which handles binary data on a general purpose CPU; for ...
Zhro's user avatar
  • 191
1 vote
1 answer
886 views

When would I use reactive programming libraries like RxJava and Project Reactor compared to stream processing engines such as Storm and Flink? I am aware that these concepts might not be directly ...
Hyggenbodden's user avatar
1 vote
1 answer
536 views

I am currently writing a Python program that retrieves the pixels from two BMP files and finds the percent difference between them (without using the Pillow library for the purpose of learning). I can ...
8ask714's user avatar
  • 23
5 votes
2 answers
1k views

Designing Data-Intensive Applications says: Batch processing systems (offline systems), Chapter 10: A batch processing system takes a large amount of input data, runs a job to process it, and ...
Tim's user avatar
  • 5,565
0 votes
0 answers
376 views

So I have a naive implementation for the following problem: We have a list of objects with action methods that have to be triggered at some value of interest. This value is polled (or is streamed) ...
user59271's user avatar
  • 101
1 vote
0 answers
171 views

Use Case: I need to join two stream sources (say orders(order_id, order_val) and shipments(shipment_id, shipment_val)) based on an id (order_id = shipment_id) and generate a new event shipment_order(id, ...
Abhishek's user avatar
  • 129
2 votes
0 answers
472 views

Let us say we have: a web app with a Postgres DB that produces data over time, another DB optimized for analytics that we would like to populate over time. My goal is to build and monitor an ETL ...
sunless's user avatar
  • 151
0 votes
1 answer
403 views

There are X million shipments per day. Each shipment has about 50 metrics to be calculated. Each metric is calculated based on a type of event (let's say event_1 has the required information to ...
Abhishek's user avatar
  • 129
3 votes
0 answers
81 views

We currently have an application that is nominally written in Java/Spring Boot, but all of the control flow is in NiFi. For example, there are the following layers in the Java application: ...
soandos's user avatar
  • 313
3 votes
0 answers
528 views

The issue: I am building a server-side system that handles streams of time-series events from multiple users, and I'd like to perform an action after a certain quiet period has elapsed in the event ...
noamt's user avatar
  • 139
3 votes
1 answer
121 views

Let's say there are two services. One generates event A and the other event B. We need to build a new service that implements the function C = func(A, B), which produces the result C. But as A and B ...
Avinav's user avatar
  • 139
0 votes
1 answer
186 views

We are working on an event-driven system that works with a streaming technology (eventhub/kinesis/kafka). Imagine some system is generating events that are sent to the event stream. Then there are ...
Martin Kosicky's user avatar
6 votes
1 answer
19k views

Scenario: We are building a UI that allows users to query our data in bulk. The return format is CSV and there is a decent amount of processing that needs to happen, so this processing is ...
Jared Goguen's user avatar
3 votes
1 answer
693 views

I am trying to improve an event-driven processing system, which is having a few problems because the events are not guaranteed to arrive in the correct chronological sequence. This is due to batching ...
scipilot's user avatar
  • 141
3 votes
2 answers
3k views

I started studying functional programming with JavaScript. After this, I started to study it with Java 8 (streams, lambdas and method references) and I realised that I tend to use streams as much as ...
Bruno Carneiro's user avatar
0 votes
1 answer
479 views

I am designing an interface for reading and writing video frames to various inputs and outputs. Stream operators seem to me a superb alternative to named functions for the task. This is the gist of it:...
Vorac's user avatar
  • 7,189
1 vote
1 answer
202 views

Why should someone use a stream processing engine like Apache Spark, Flink, Hadoop instead of just a normal backend worker which works on something and returns the results as soon as it's done? ...
Muhammad's user avatar
  • 399
11 votes
3 answers
7k views

I stumbled upon a question on Code Review, and in one answer the feedback was to avoid std::endl because it flushes the stream. The full quote is: I'd advise avoiding std::endl in general. Along with ...
klutt's user avatar
  • 1,448
-1 votes
1 answer
305 views

I have a Kafka consumer that gets data from Kafka, does some processing if certain conditions are met, and sends it to another consumer to do its job. How can I build something like this? One way I think ...
mohsenJsh's user avatar
  • 1,375
1 vote
1 answer
164 views

I need to build a data store for transactions that meets the following requirements: A 'transaction' is effectively a state machine which moves through a number of statuses during its lifecycle and ...
jimmy_terra's user avatar
0 votes
2 answers
309 views

So, there is a workflow I feel needs to exist. It may already exist, and I don't know about it. Consider the Linux command line workflow, the "motto" of Linux, more or less. You have a bunch of ...
Erhannis's user avatar
  • 145
-2 votes
1 answer
599 views

Assume You have some POJO Animal like public class Animal { // some fields } You have some enum AnimalType like public enum AnimalType { // some animal types } You have some method that given an ...
geofflittle's user avatar
0 votes
1 answer
209 views

I receive sensor data as a binary stream of bytes. This stream is not always the same length, and does not include the same data set each time. If the sensor did not send a field, it is simply absent, ...
foxtrotuniform6969's user avatar
2 votes
2 answers
289 views

I am facing an issue where I have a data stream that sends unordered data. I'm trying to find a way to receive the data in random order, but send it in order. As an example, I'll receive object4 and ...
Solidak's user avatar
  • 167
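One common way to approach the reordering described in the excerpt above is a small buffer keyed by sequence number. The sketch below is only an illustration under assumptions the truncated question does not confirm: every item carries a consecutive integer sequence number, numbering starts at 1, and there are no gaps or duplicates. The name `reorder` is hypothetical.

```python
import heapq


def reorder(events, first_seq=1):
    """Yield (seq, payload) pairs in sequence order, buffering early arrivals.

    Assumes consecutive sequence numbers with no gaps; a missing number
    would leave every later item waiting in the buffer.
    """
    pending = []  # min-heap ordered by sequence number
    next_seq = first_seq
    for seq, payload in events:
        heapq.heappush(pending, (seq, payload))
        # Release everything that is now contiguous with what was already sent.
        while pending and pending[0][0] == next_seq:
            yield heapq.heappop(pending)
            next_seq += 1


# Items arrive out of order, e.g. object4 before object1.
arrivals = [(4, "object4"), (1, "object1"), (3, "object3"), (2, "object2"), (5, "object5")]
ordered = list(reorder(arrivals))
```

A real system would likely also need a timeout or a bound on the buffer size so that a lost item cannot stall the output forever.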
6 votes
2 answers
161 views

I'm working on a file-synchronization client that currently produces a stream of changes to the underlying filesystem. That is, a stream of create/update/move/delete events is produced for each ...
Louis Thibault's user avatar
0 votes
1 answer
908 views

I am working on a Python 3 program that will among other things write data to text files in a number of formats. I have one function per format, generating the data to be written. I see two possible ...
Anders's user avatar
  • 1,361
-1 votes
1 answer
90 views

I'm having trouble planning the structure of my server-side workflow and the technologies I should use. The basic structure and tasks are: Now, things to consider: 1. the server listens to multiple "...
mizenetofa1989's user avatar
1 vote
1 answer
666 views

Why is the sorted() method on Stream not named sort()? list.stream().sort()
benshqd's user avatar
  • 29
2 votes
2 answers
2k views

I browsed Wikipedia and encountered a new feature of Java 1.9: Reactive streams. Unfortunately, the linked Wikipedia page doesn't help much to understand what those reactive streams are and what's so ...
juhist's user avatar
  • 2,579