Sunday, October 12, 2014

Alternatives to Executors when scheduling Tasks/Actors



Executors work well when fed short units of stateless work, e.g. dividing a computation across many CPUs. However, they are sub-optimal for scheduling jobs which are part of an ongoing, larger unit of work, e.g. the messages of an actor or lightweight process.

Many actor frameworks (and similar message-passing concurrency frameworks) schedule batches of messages using an Executor service. As the Executor service is not context aware, this spreads the processing of a single actor's or task's messages across several threads/CPUs.

This can lead to many cache misses when accessing the state of an actor/process/task, as the processing thread changes frequently.
Even worse, CPU caches cannot stabilize this way, as each new "Runnable" washes out the cached memory of the previously processed task.

A second issue arises when using busy-spin polling. If a framework reads its queues using busy-spin, it generates 100% CPU load for each processing thread. One would therefore like to add a second thread/core only if absolutely required, so a one-thread-per-actor policy is not feasible.
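To illustrate the trade-off, here is a minimal sketch (not Kontraktor code) of a consume loop that spins while messages keep arriving but parks briefly once it has been idle for a while, so an idle worker does not burn a full core; the class name, queue choice and threshold are illustrative assumptions:

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.LockSupport;

public class BackoffPollLoop {
    private final Queue<Runnable> queue = new ConcurrentLinkedQueue<>();
    private volatile boolean running = true;

    public void submit(Runnable msg) { queue.add(msg); }
    public void stop() { running = false; }

    // Consume loop: spin while messages arrive, but after ~1000 empty polls
    // start parking briefly so an idle worker does not show 100% CPU load.
    public void run() {
        int idlePolls = 0;
        while (running) {
            Runnable msg = queue.poll();
            if (msg != null) {
                idlePolls = 0;      // queue was non-empty, keep spinning
                msg.run();
            } else if (++idlePolls > 1000) {
                LockSupport.parkNanos(100_000); // back off ~0.1 ms when idle
            }
        }
    }
}
```

The park duration trades latency against idle CPU load: pure busy-spin gives the lowest wake-up latency but makes every extra worker thread expensive.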

With Kontraktor 2.0 I implemented a different scheduling mechanism, which achieves horizontal scaling using a very simple metric to measure actual application CPU requirements, without randomly spreading the processing of an actor onto different cores.


An actor has a fixed assignment to a worker thread ("DispatcherThread"). The scheduler periodically reschedules actors by moving them to another worker if necessary.

Since overly sophisticated algorithms tend to introduce high runtime costs, actual scheduling is done in a very simple manner:

1) A thread is observed to be overloaded if its consume loop (pulling messages from the actor queues) has not been idle for N messages (currently N=1000).
2) If a thread has been marked "overloaded", the actors with the largest mailbox sizes are migrated to a newly started thread (until #Threads == ThreadMax), as long as SUM_QUEUED_MSG(actors scheduled on thread A) > SUM_QUEUED_MSG(actors scheduled on newly created thread B).
3) In case #Threads == ThreadMax, actors are rebalanced based on the sum of queued messages and "overload" observations.
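The overload detection and migration rule above can be sketched roughly as follows; note this is an illustrative model, not Kontraktor's actual API (Worker, Actor and OVERLOAD_THRESHOLD are made-up names):

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class SchedulerSketch {
    static final int OVERLOAD_THRESHOLD = 1000; // N consecutive non-idle polls

    static class Actor {
        int mailboxSize;
        Actor(int mailboxSize) { this.mailboxSize = mailboxSize; }
    }

    static class Worker {
        int nonIdlePolls;                  // reset whenever the consume loop finds an empty queue
        List<Actor> actors = new ArrayList<>();

        // Rule 1: overloaded if the consume loop stayed busy for N polls in a row.
        boolean isOverloaded() { return nonIdlePolls >= OVERLOAD_THRESHOLD; }

        // SUM_QUEUED_MSG over all actors assigned to this worker.
        int queuedMessages() {
            return actors.stream().mapToInt(a -> a.mailboxSize).sum();
        }
    }

    // Rule 2: move the actors with the largest mailboxes to a freshly started
    // worker, as long as the overloaded worker still queues more messages
    // than the fresh one.
    static void rebalance(Worker overloaded, Worker fresh) {
        overloaded.actors.sort(
            Comparator.comparingInt((Actor a) -> a.mailboxSize).reversed());
        while (!overloaded.actors.isEmpty()
                && overloaded.queuedMessages() > fresh.queuedMessages()) {
            fresh.actors.add(overloaded.actors.remove(0));
        }
    }
}
```

The key point is that the metric (queued message counts plus the non-idle-poll counter) is cheap to maintain inside the consume loop itself, so the scheduler adds almost no overhead to message processing.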

Problem areas: 
  • If the processing time of messages varies a lot, the sum of queued messages is misleading. An improvement would be to put a weight on each message type by profiling periodically. A simpler option would be to give an actor an additional weight to be multiplied with its queue size.
  • For load bursts, there is a latency until all of the available CPU's are actually used.
  • The delay until the JIT kicks in produces bad profiling data and leads to false scale-ups (this heals over time, so it is not that bad).
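The weighting idea from the first bullet could look roughly like this: scale each actor's queue length by an exponentially smoothed average of its observed per-message processing time, so expensive messages count more toward a worker's load estimate. This is a hypothetical sketch, not something Kontraktor implements in this form:

```java
public class WeightedMailbox {
    private double avgNanosPerMsg = 1.0;        // smoothed cost of one message
    private static final double ALPHA = 0.1;    // smoothing factor, assumed value

    // Called by the consume loop after processing a message.
    void recordProcessingTime(long nanos) {
        avgNanosPerMsg = (1 - ALPHA) * avgNanosPerMsg + ALPHA * nanos;
    }

    // Queue size scaled by observed per-message cost; comparable across
    // actors even when their message types differ wildly in expense.
    double weightedLoad(int mailboxSize) {
        return mailboxSize * avgNanosPerMsg;
    }
}
```

The smoothing also dampens the JIT warm-up problem from the last bullet somewhat, since early (slow, interpreted) samples decay away as compiled-code timings arrive.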

An experiment

Update: Woke up this morning and it came to my mind that this experiment has a flaw: jobs per work item are scheduled in parallel for the executor tests, so I am probably measuring the effects of false sharing. Results below removed; check the follow-up post.



Performance cost of adaptive scaling


To measure the cost of auto-scheduling vs. explicit Actor=>Thread assignment, I ran the Computing-Pi test (see previous posts).
These numbers do not show effects of locality, as the test compares explicit scheduling with automatic scheduling.

Test 1 manually assigns a thread to each Pi-computing actor;
Test 2 always starts with one worker and has to scale up automatically once it detects actual load.


(Note: the example requires >= kontraktor2.0-beta-2; if the 'parkNanos' stuff is uncommented, it scales up to 2-3 threads only.)

The test is run 8 times, increasing thread_max by one with each run.

Results:

Kontraktor Autoscale (always start with 1 thread, then scale up to N threads)

1 threads : 1527
2 threads : 1273
3 threads : 718
4 threads : 630
5 threads : 521
6 threads : 576
7 threads : 619
8 threads : 668

Kontraktor with dedicated assignment of threads to Actors (see commented line in source above)

1 threads : 1520
2 threads : 804
3 threads : 571
4 threads : 459
5 threads : 457
6 threads : 534
7 threads : 615
8 threads : 659


Conclusion #2

Differences in runtimes can mostly be attributed to the delay in scaling up. For deterministic load problems, prearranged actor scheduling is more efficient, of course. However, thinking of a server receiving varying request loads over time, automatic scaling is a valid option.
