Unexpected doMethodLater task and Sequence Wait() timing

greenpanda · October 15, 2023, 7:06pm

import time
import panda3d.core
from direct.showbase.ShowBase import ShowBase
from direct.task import Task

base = ShowBase()

timeFromLastFrame = 0

def doMethodLater(task):
	global timeFromLastFrame
	
	currentTime = time.time()
	elapsedTime = currentTime - timeFromLastFrame
	print(elapsedTime)
	timeFromLastFrame = currentTime
	
	return task.again
	
base.taskMgr.doMethodLater(1/60, doMethodLater, "doMethodLater")

base.run()

I’d expect the task to run every 1/60 seconds, but the output log is very strange:

0.033344268798828125
0.03333926200866699
0.016874313354492188
0.03317904472351074
0.016629934310913086
0.0333251953125
0.03328251838684082
0.0167236328125
0.016646146774291992
0.033270835876464844
0.033332109451293945
0.01676487922668457
0.033345699310302734
0.032674312591552734
0.016810178756713867
0.017063379287719727
0.016716480255126953
0.033325910568237305
0.01668071746826172
0.03327798843383789
0.016716480255126953
0.03332853317260742
0.03332114219665527
0.016675233840942383
0.033351898193359375
0.01786661148071289
0.032109737396240234
0.016745328903198242
0.03325295448303223
0.016701698303222656
0.03330278396606445
0.016758441925048828
0.03318929672241211
0.0167386531829834
0.03331899642944336
0.03334999084472656
0.033368825912475586
0.0332491397857666
0.0172576904296875
0.032778263092041016
0.016712665557861328
0.03325390815734863
0.016722917556762695
0.03332257270812988
0.016674041748046875
0.01669907569885254
0.03332233428955078
0.016642093658447266
0.03333163261413574
0.03332662582397461
0.016921281814575195
0.03332710266113281
0.03311443328857422
0.03335428237915039
0.03328752517700195
0.016675949096679688
0.01673102378845215
0.03328204154968262
0.0333712100982666
0.033332109451293945
0.03329968452453613
0.01665639877319336
0.016652822494506836
0.03327345848083496
0.01674675941467285
0.016788721084594727
0.03318953514099121
0.016715049743652344
0.03320670127868652
0.0167083740234375
0.0167849063873291
0.033277273178100586
0.03327798843383789
0.016729116439819336
0.016688108444213867
0.03325533866882324
0.03338193893432617
0.016707420349121094
0.033222198486328125
0.016717910766601562
0.016697406768798828
0.033333778381347656
0.01666116714477539
0.03333592414855957
0.032968759536743164
0.0169827938079834
0.016723155975341797
0.03329801559448242
0.01673746109008789
0.03332328796386719
0.03358817100524902
0.032286882400512695
0.03357100486755371
0.01674818992614746
0.017148256301879883
0.03331279754638672
0.017984628677368164
0.0322420597076416
0.03321099281311035
0.016751766204833984
0.0333859920501709
0.0332949161529541
0.03373861312866211
0.03329205513000488
0.016692399978637695
0.03321385383605957
0.016788959503173828
0.03335404396057129
0.01812434196472168
0.031736135482788086

What is going on?
How can I be sure to run a function every 1/60 seconds, without using ordinary tasks? I don’t want certain functionality tied to the refresh rate, while it still being called in specific time intervals. I know getDeltaTime() can be used in some cases such as transforms to avoid certain functionality to be dependent of framerate dips or increase, but it’s not a perfect solution when you just need to perform other functionality in specific time intervals. I hoped doMethodLater tasks would aid here, but it seems like they are simply still ran every frame, merely a delay is applied based on the time you specified, rather than the delay being the time interval between which the functions will be called. Even if we ignore the 0.033 second values in the log, the 0.016 ones aren’t really accurate themselves.

Using Sequences with a function interval and Wait inside doesn’t do the trick either:

import time
import panda3d.core
from direct.showbase.ShowBase import ShowBase
from direct.interval.IntervalGlobal import *

base = ShowBase()

timeFromLastFrame = 0

def methodToRun():
	global timeFromLastFrame
	
	currentTime = time.time()
	elapsedTime = currentTime - timeFromLastFrame
	print(elapsedTime)
	timeFromLastFrame = currentTime
	
funcInterval = Func(methodToRun)
mySequence = Sequence(funcInterval, Wait(1/60))
mySequence.loop()

base.run()

0.036360979080200195
0.016874313354492188
0.016983747482299805
0.016647815704345703
0.01668071746826172
0.01659679412841797
0.016788721084594727
0.016569137573242188
0.016723155975341797
0.01656198501586914
0.033400535583496094
0.016652822494506836
0.016262054443359375
0.01653289794921875
0.01680755615234375
0.017047882080078125
0.0179440975189209
0.01542806625366211
0.016661643981933594
0.016650676727294922
0.016713380813598633
0.016632556915283203
0.01662158966064453
0.016775131225585938
0.016597509384155273
0.01668548583984375
0.016673564910888672
0.01665782928466797
0.01663827896118164
0.016702651977539062
0.016625642776489258
0.01668381690979004
0.01663804054260254
0.016623497009277344
0.01675701141357422
0.016595840454101562
0.01672840118408203
0.016658782958984375
0.01658153533935547
0.016684770584106445
0.016745805740356445
0.016600608825683594
0.016776084899902344
0.01656365394592285
0.0166776180267334
0.01668095588684082
0.016765832901000977
0.01653575897216797
0.016766786575317383
0.016631364822387695
0.016571998596191406
0.016719818115234375
0.01663661003112793
0.016740083694458008
0.01671886444091797
0.016659975051879883
0.01664900779724121
0.01661372184753418
0.017797470092773438
0.015555620193481445
0.03335404396057129

rdb · October 15, 2023, 7:31pm

Hi, welcome to the forums!

You should see the task manager as a list of tasks that is run every frame. When you use doMethodLater, Panda will simply wait a certain amount of time before putting your task on that list. The render, collision, etc. subsystems of Panda are also tasks that are part of this list. They will take some amount of time to run, during which time other things can’t be run. Therefore, you cannot guarantee that your function is called precisely at a certain time. And even if all the other Panda tasks took no time at all, the operating system may have scheduled some other process to run on the processor at the time you want your task to run.

What I just said is true only in a single-threaded environment. When you add multiple threads into the picture, things can run in parallel, so your function could run while Panda is rendering. Threads are still heavily limited in Python by the Global Interpreter Lock, though, which means that two bits of Python code can’t run simultaneously: your function can run while Panda is rendering a frame (since that is a C++ function), but if another thread is currently executing a bit of Python code, your function will still have to wait before it can run. Using threading also introduces lots of difficulties because you need to think about things like synchronization, locking, race conditions, etc.

(If you did want to schedule your task in a different thread, just pass the taskChain parameter to the doMethodLater call to force Panda to create a new task chain. You can ask Panda to run this task chain in a separate thread. Consult the manual for more information on this.)

However, because you can’t rely on precise scheduling one way or another, the best solution is just to write your code in such a way that it doesn’t rely on a precise timing. This means taking advantage of delta time, for example, to scale the effects of your calculations. You could also keep track of how many times your function has run within a certain second, so that you can call it multiple times in a frame to “catch up” - this is particularly useful for simulations where running it with a too coarse timestep can cause problems.

greenpanda · October 15, 2023, 8:05pm

The main issue isn’t that it’s not exactly 1.666666ms each time, the issue is that often times it is something like 1.5 or even 3.3ms. That doesn’t make sense considering I’m not doing anything else in the code.

When you use doMethodLater, Panda will simply wait a certain amount of time before putting your task on that list.

But the question is, is the task manager only going through the list of tasks only every frame? Or does doMethodLater function actually get called after the time intervals specified?
This makes less sense for the Sequence-based code snippet. If I want to wait a specified time and then run a function, why would the frame timing have any effect on that?

However, because you can’t rely on precise scheduling one way or another, the best solution is just to write your code in such a way that it doesn’t rely on a precise timing.

I don’t think that’s an option here. I need to call a Panda3d method in precise intervals for diagnostic reasons. I want to write a program that allows to diagnose when a GPU has repeated a frame. That is surprisingly common, apparently.

Observe the full code:

import math
import panda3d.core

panda3d.core.loadPrcFileData("", "fullscreen #t")
panda3d.core.loadPrcFileData("", "win-size 1920 1080")
panda3d.core.loadPrcFileData("", "sync-video #t")
panda3d.core.loadPrcFileData("", "show-frame-rate-meter #t")
# below two make sure the loaded textures are pixel-perfect
panda3d.core.loadPrcFileData("", "textures-auto-power-2 true")
panda3d.core.loadPrcFileData("", "textures-power-2 up")

from direct.showbase.ShowBase import ShowBase
from direct.task import Task
from direct.gui.OnscreenText import OnscreenText

base = ShowBase() # default Panda3d object
base.setBackgroundColor(0,0,0,1)

# GUI frame counter
frame_state = 0
frame_index = 0
frame_counter_gui = OnscreenText(text="frame id: " + str(frame_index), fg = (1,1,1,1), pos=(-0.75, 0), scale=0.2)

def loadImageAsPlane(filepath):
	tex = loader.loadTexture(filepath)
	#tex.setBorderColor(panda3d.core.Vec4(0,0,0,0))
	tex.setWrapU(panda3d.core.Texture.WMBorderColor)
	tex.setWrapV(panda3d.core.Texture.WMBorderColor)
	cm = panda3d.core.CardMaker(filepath + ' card')
	cm.setFrame(-1, 1, -1, 1)
	card = panda3d.core.NodePath(cm.generate())
	card.setTexture(tex)
	card.flattenLight() # apply scale
	return card

grid_ui = loadImageAsPlane("grid.png")
grid_ui.setTransparency(panda3d.core.TransparencyAttrib.MAlpha)
grid_ui.reparentTo(render2d)

scrolling_card_top_x = 0
scrolling_card_top_y = 0
scrolling_card_bottom_x = 0
scrolling_card_bottom_y = 2

cm = panda3d.core.CardMaker("card")
cm.setFrame(-0.05, 0.05, -0.05, 0.05)

scrolling_card_top = panda3d.core.NodePath(cm.generate())
scrolling_card_top.setColorScale(1,1,1,1)
scrolling_card_top.setBin("fixed", 0)
scrolling_card_top_parent = base.render2d.attachNewNode("scrollingCardTopParent")
scrolling_card_top.reparentTo(scrolling_card_top_parent)
scrolling_card_top_parent.setX(-1+1/15)
scrolling_card_top_parent.setZ(1-1/16)

scrolling_card_bottom = panda3d.core.NodePath(cm.generate())
scrolling_card_bottom.setColorScale(1,1,1,1)
scrolling_card_bottom.setBin("fixed", 0)
scrolling_card_bottom_parent = base.render2d.attachNewNode("scrollingCardBottomParent")
scrolling_card_bottom.reparentTo(scrolling_card_bottom_parent)
scrolling_card_bottom_parent.setX(-1+1/15)
scrolling_card_bottom_parent.setZ(-1/16)

def frameIndexUpdater(task):
	global frame_index
	global frame_state
	
	frame_index += 1
	frame_counter_gui.setText("frame id: " + str(frame_index))
	
	if frame_state == 0:
		frame_counter_gui.setX(0.75) # so that LCD monitors won't blend the numbers for sa camera capture
		frame_state = 1
	elif frame_state == 1:
		frame_counter_gui.setX(-0.75)
		frame_state = 0
		
	return task.cont
	
def topCardUpdater(task):
	global scrolling_card_top_x
	global scrolling_card_top_y
	
	scrolling_card_top.setX(2/15 * scrolling_card_top_x)
	scrolling_card_top.setZ(-2/16 * scrolling_card_top_y)
	if scrolling_card_top_x == 14: # index is counted from 0
		scrolling_card_top_x = 0
		if scrolling_card_top_y == 5:
			scrolling_card_top_y = 0
		else:
			scrolling_card_top_y += 1
	else:
		scrolling_card_top_x +=1
	
	return task.again

def bottomCardUpdater(task):
	global scrolling_card_bottom_x
	global scrolling_card_bottom_y
	
	scrolling_card_bottom.setX(2/15 * scrolling_card_bottom_x)
	scrolling_card_bottom.setZ(-2/16 * scrolling_card_bottom_y)
	if scrolling_card_bottom_x == 14: # index is counted from 0
		scrolling_card_bottom_x = 0
		if scrolling_card_bottom_y == 7:
			scrolling_card_bottom_y = 2
		else:
			scrolling_card_bottom_y += 1
	else:
		scrolling_card_bottom_x +=1
	
	return task.cont

frameIndexUpdaterTask = base.taskMgr.add(frameIndexUpdater, "coloredCardAndFrameIndexUpdater")

# we need to wait a while until the framerate stabilizes and we can actually measure it:
def wait(task):
	# a hacky way of determining monitor Hz, Panda3D does not seem to have a way to tell you that number itself
	if int(globalClock.getAverageFrameRate()) in (143,144,145):
		fps = 144
	else:
		fps = round(int(globalClock.getAverageFrameRate()))
	topCardUpdaterTask = base.taskMgr.doMethodLater(1.0/fps, topCardUpdater, "topCardUpdater")
	return task.done
base.taskMgr.doMethodLater(2, wait, "wait")

bottomCardUpdaterTask = base.taskMgr.add(bottomCardUpdater, "bottomCardUpdater")

base.run()

Before I had only the bottom grid. By capturing the output of several 120-240Hz monitors with a high refresh rate camera, I noticed that sometimes the white card stays in place for 2 frames, which hints at the monitor receiving or displaying the same frame twice sometimes.

To confirm this suspicion, I created the top grid, where the white card is positioned not based on the frame refresh, but based on a specific time interval equal to the frame time.
If it worked as expected then unlike the white card in the bottom gird, the one on the top would skip a cell in the grid if it had been displayed on the same cell twice previously.
But since I don’t seem to have set it up right due to how tasks and sequences work, I can’t perform this test, the white bar in the top grid always lags behind.

the image used by the code:

rdb · October 15, 2023, 8:42pm

What does the frame rate meter actually indicate? 60 fps? Because it sounds like Panda (or the GPU driver) is syncing to the vertical blanking interval. That means that when Panda asks the graphics library to submit a frame, the driver waits for the vblank interval, before actually showing the rendered frame and returning control to Panda. So the renderFrame call (in the igLoop task that ShowBase creates) is actually blocking for 16.66… ms, minus the time that the other tasks take to run. Short of multithreading, there’s no way for Panda to run your task or interval even if it wanted to, because the graphics driver is in control of the thread for the vast majority of that 16 ms.

The 33ms sounds like slight variations in frame timing caused the delay not to be met, so it got punted to the next frame. As an experiment, try setting the delay a little lower, say, 10ms, and you might observe that it does get run every 16.66 ms. So, doMethodLater isn’t really meant for these kinds of small delays, so it’s not really going to help you do what you want. It can’t schedule things more often than once per frame. (Neither can intervals, for what it’s worth.)

If you need to run something at a precise interval independently of Panda’s frame timing, you need to look at using multithreading, or a separate process.

As for your specific problem, I can only guess - perhaps Panda has missed the vertical blanking interval (sometimes setting auto-flip true in Config.prc can help with this). Maybe it’d be useful to grab the development version of PStats, which has a new Timeline view that would really help with diagnosing this, in conjunction with pstats-gpu-timing. If you want, and you let me know your OS, I can send you a development build of Panda 1.11 that contains this new feature.

greenpanda · October 15, 2023, 9:18pm

What does the frame rate meter actually indicate? 60 fps? Because it sounds like Panda (or the GPU driver) is syncing to the vertical blanking interval. That means that when Panda asks the graphics library to submit a frame, the driver waits for the vblank interval, before actually showing the rendered frame and returning control to Panda. So the renderFrame call (in the igLoop task that ShowBase creates) is actually blocking for 16.66… ms, minus the time that the other tasks take to run

Yes, the framerate meter seems to show 60. On few monitors it seems to fluctuate between 60 and 59.9. I always thought that the 59.9 was due to some inaccuracy with the fps counter, since the monitor is vsynced. Same with 120Hz and 240Hz monitor. With some the panda fps meter shows solid 120, 144 or 240, some others fluctuate. With one specific monitor the panda fps meter shows a solid 241, for some reason.

Not sure I follow. Yes, vsync is enabled so I’d expect Panda to wait until the vertical sync even when it’s ready. The issue I describe of repeating frames at the monitor happens once every few seconds, it is not in every consecutive frame or cyclical.
I’ve tested and it also seems to happen in the Unity game engine. I’ve tested with 3 separate PCs and 2 operating system so far (Windows and Linux). Only one monitor, one of the builtin displays of one of the gaming notebooks, seems completely free from such issue, even at a whopping 240Hz. Another notebook’s display has the issue even at 120Hz.

If you need to run something at a precise interval independently of Panda’s frame timing, you need to look at using multithreading, or a separate process.

I didn’t find anything about doing multiprocessing with Panda. As for threading, I only found this: Threading — Panda3D Manual

I assume you mean to have a separate thread which will only call the white card repositioning method and then have a thread wait() for the needed time interval?

As an experiment, try setting the delay a little lower, say, 10ms, and you might observe that it does get run every 16.66 ms.

No, it’s still not quite 16.66:

0.016563892364501953
0.016065120697021484
0.016621828079223633
0.017149686813354492
0.01694178581237793
0.016738414764404297
0.01663661003112793
0.01664566993713379
0.016683101654052734
0.01674675941467285
0.016680002212524414
0.016704559326171875
0.016600847244262695
0.016715049743652344
0.016655445098876953
0.016787052154541016
0.016618728637695312
0.01669025421142578
0.016747474670410156
0.01656198501586914
0.01671743392944336
0.016651630401611328
0.016745328903198242
0.016601085662841797
0.01674675941467285
0.016614913940429688
0.01671433448791504
0.016709327697753906
0.016539335250854492
0.0166013240814209
0.016901254653930664
0.01663994789123535
0.016697168350219727
0.01677989959716797
0.016585826873779297
0.016681432723999023
0.016683578491210938
0.016683340072631836
0.016633033752441406
0.016791105270385742
0.01658916473388672
0.016795873641967773
0.016577959060668945
0.01669478416442871
0.016663551330566406
0.016739368438720703
0.01665186882019043
0.016640901565551758
0.01683974266052246
0.016535043716430664
0.01669621467590332
0.01665496826171875
0.016767501831054688
0.016551494598388672
0.01682305335998535
0.016572237014770508
0.0167543888092041
0.016699552536010742
0.016659021377563477
0.0164792537689209
0.01686835289001465
0.01664280891418457
0.016742229461669922
0.0168607234954834
0.016312837600708008
0.016819000244140625
0.016536951065063477
0.01670098304748535
0.016645431518554688
0.01674032211303711
0.016582727432250977
0.016756772994995117
0.016658306121826172
0.016712427139282227
0.01666736602783203
0.016712427139282227
0.016549348831176758
0.01719498634338379
0.016884803771972656
0.016152620315551758
0.01666402816772461
0.01665520668029785
0.016674518585205078
0.016482830047607422
0.016887426376342773
0.016609668731689453
0.0167539119720459
0.016636371612548828
0.01669454574584961
0.016639232635498047
0.016740798950195312
0.016587257385253906
0.016752004623413086
0.016705751419067383
0.016606807708740234
0.016758203506469727
0.016635417938232422
0.016752004623413086
0.016620635986328125
0.016770601272583008
0.016633272171020508
0.016675233840942383
0.01668572425842285
0.016726970672607422
0.016716957092285156
0.016578197479248047
0.016594648361206055
0.01674818992614746
0.016702651977539062
0.016634702682495117
0.016773462295532227
0.016611576080322266
0.016692161560058594
0.016645431518554688
0.01694941520690918
0.016432762145996094
0.016745567321777344
0.01657843589782715
0.01677680015563965
0.01662302017211914
0.016718149185180664
0.016752004623413086
0.01659083366394043
0.016701221466064453
0.016632556915283203
0.016709089279174805
0.016678810119628906
0.016713380813598633
0.016672134399414062
0.01669025421142578
0.016652822494506836
0.016733884811401367
0.016660451889038086
0.01666736602783203
0.01668548583984375
0.016663789749145508
0.016628265380859375
0.016680002212524414
0.016724586486816406
0.016694068908691406
0.016702890396118164
0.01663374900817871
0.01671767234802246
0.016574859619140625
0.016776323318481445
0.016682863235473633
0.01670241355895996
0.016631126403808594
0.01671123504638672
0.01664137840270996
0.01672983169555664
0.016736745834350586
0.016572237014770508
0.016699552536010742
0.016676664352416992
0.016712188720703125
0.016655683517456055
0.016772747039794922
0.01659250259399414
0.016752243041992188
0.016593217849731445
0.01674509048461914
0.016693592071533203
0.016679763793945312
0.016651391983032227
0.016667604446411133
0.016689300537109375
0.01663661003112793
0.016762733459472656
0.01665019989013672
0.016604900360107422
0.01674628257751465
0.016706466674804688
0.016600608825683594
0.016788721084594727
0.016623973846435547
0.016725778579711914
0.0169222354888916
0.016390562057495117
0.016984224319458008
0.0164029598236084
0.016650915145874023
0.0166170597076416
0.016749858856201172
0.01665639877319336
0.016722440719604492
0.016695022583007812
0.01670694351196289
0.016879796981811523
0.016412019729614258
0.01671457290649414
0.016674518585205078
0.01670217514038086
0.01667499542236328
0.016571521759033203
0.016736984252929688
0.01672840118408203
0.016651630401611328
0.016771554946899414
0.01675105094909668
0.01652979850769043
0.0166628360748291
0.01671576499938965
0.01685500144958496
0.0165555477142334
0.01679396629333496
0.016519546508789062
0.016679048538208008
0.016690492630004883
0.016710996627807617
0.016855478286743164
0.016691207885742188
0.016646862030029297
0.016713380813598633
0.016649961471557617
0.016735553741455078
0.016640186309814453
0.016719341278076172
0.016630887985229492
0.016757726669311523
0.01652979850769043
0.016820430755615234
0.016635656356811523
0.016675949096679688
0.016791343688964844
0.01657891273498535
0.016690492630004883
0.016672134399414062
0.016710281372070312
0.016640186309814453
0.016704559326171875
0.017307519912719727
0.015201330184936523
0.016808509826660156
0.016614198684692383
0.0166928768157959
0.016752004623413086
0.017699003219604492
0.016475439071655273
0.016683340072631836
0.016672134399414062
0.016708850860595703
0.016640663146972656
0.0167083740234375
0.016661643981933594
0.016719818115234375
0.01630544662475586
0.016691923141479492
0.016634225845336914
0.016713857650756836
0.01665329933166504
0.016771554946899414
0.01726984977722168
0.01608133316040039
0.016646146774291992
0.016755342483520508
0.01666998863220215
0.01663804054260254
0.016704559326171875
0.01663064956665039
0.01671123504638672
0.016622066497802734
0.016778945922851562
0.016651630401611328
0.01669597625732422
0.01661515235900879
0.016714096069335938
0.016663551330566406
0.017016887664794922
0.017390012741088867
0.015665292739868164
0.01666736602783203
0.016681432723999023
0.016683340072631836
0.016692161560058594
0.016715049743652344
0.016610145568847656
0.016704797744750977
0.016668081283569336
0.01678323745727539
0.016566038131713867
0.01678013801574707
0.01684403419494629
0.01650691032409668
0.01669478416442871
0.016658544540405273
0.01672840118408203
0.016583919525146484
0.01669478416442871
0.01662731170654297
0.016745328903198242
0.01665019989013672
0.016759395599365234
0.016593456268310547
0.0159149169921875
0.01763010025024414
0.01624321937561035
0.01689934730529785
0.016255617141723633
0.016604900360107422
0.016637802124023438
0.01669597625732422
0.016832351684570312
0.016505002975463867
0.016469478607177734
0.016664743423461914
0.01687335968017578
0.017348051071166992
0.016966819763183594
0.016478776931762695
0.01661992073059082
0.01677107810974121
0.01656198501586914
0.016640186309814453
0.016735076904296875
0.016683578491210938
0.0167081356048584
0.01665973663330078
0.016670703887939453
0.016633033752441406
0.016736984252929688
0.016669273376464844
0.01690840721130371
0.016757965087890625
0.01634693145751953
0.016634702682495117
0.016726255416870117
0.016727924346923828
0.0166168212890625
0.01676630973815918
0.016599178314208984
0.016727209091186523
0.016654491424560547
0.016688823699951172
0.01665353775024414
0.016730070114135742
0.016642332077026367
0.01676344871520996
0.01699352264404297
0.016317367553710938
0.016634225845336914
0.016710281372070312
0.01593327522277832
0.016817092895507812
0.017438173294067383
0.016582489013671875
0.016611099243164062
0.01672077178955078
0.016748905181884766
0.01659417152404785
0.016770601272583008
0.01685810089111328
0.0164947509765625
0.01661205291748047
0.016721248626708984
0.016480207443237305
0.016900300979614258
0.01687002182006836
0.01686549186706543
0.016657352447509766
0.016653776168823242
0.016338825225830078
0.016575336456298828
0.01680588722229004
0.016576528549194336
0.01681995391845703
0.01688385009765625
0.016431093215942383
0.01663970947265625
0.016639232635498047
0.01672506332397461
0.016634702682495117
0.016693115234375
0.016612529754638672
0.016889095306396484
0.016524076461791992
0.016776561737060547
0.01674485206604004
0.016628265380859375
0.01665043830871582
0.016691207885742188
0.016867876052856445
0.016475439071655273
0.016680479049682617
0.016694307327270508
0.016818523406982422
0.01665782928466797
0.016628026962280273
0.016524791717529297
0.016844511032104492
0.016536712646484375
0.016814231872558594
0.016669034957885742
0.016590356826782227
0.016756772994995117
0.016658544540405273
0.016836881637573242
0.016609907150268555
0.016691923141479492
0.016731739044189453
0.016744613647460938
0.016582250595092773
0.0170743465423584
0.016553401947021484
0.016444921493530273
0.015979290008544922
0.016774654388427734
0.01638340950012207
0.016641855239868164
0.01650071144104004
0.016929149627685547
0.017748594284057617
0.016537904739379883
0.016585350036621094
0.016910076141357422
0.016095399856567383
0.016112089157104492
0.016721487045288086
0.016898393630981445
0.016629695892333984
0.016541719436645508
0.01668405532836914
0.01684880256652832
0.01750040054321289
0.016704082489013672
0.016662120819091797
0.01679396629333496
0.016663551330566406
0.0163424015045166
0.01671743392944336
0.016757488250732422
0.01648712158203125
0.016880273818969727
0.016667842864990234
0.016759872436523438
0.01639842987060547
0.01687169075012207
0.016667842864990234
0.016718387603759766
0.01672530174255371
0.016577720642089844
0.01674056053161621
0.016652345657348633
0.01674365997314453
0.0166471004486084
0.01671767234802246
0.016585111618041992
0.016720294952392578
0.016661643981933594
0.016724348068237305
0.01666569709777832
0.016686677932739258
0.01663970947265625
0.01671886444091797
0.01665353775024414
0.01702141761779785
0.0167694091796875
0.016557931900024414
0.016745328903198242
0.016574859619140625
0.01679229736328125
0.016611814498901367
0.016777992248535156
0.01659703254699707
0.01676774024963379
0.01657700538635254
0.016451597213745117
0.016640901565551758
0.016736984252929688
0.016649246215820312
0.016643047332763672
0.01709747314453125
0.016647815704345703
0.01583552360534668
0.01663970947265625
0.016507863998413086
0.01661229133605957
0.01667165756225586
0.016745567321777344
0.016639232635498047
0.01662755012512207
0.01653766632080078
0.016991615295410156
0.017636775970458984
0.01675891876220703
0.016605854034423828
0.01675105094909668
0.016537189483642578
0.01681351661682129
0.016573190689086914
0.016780853271484375
0.016648292541503906
0.01673746109008789
0.016602277755737305
0.016736984252929688
0.01665472984313965
0.016629457473754883
0.016861915588378906
0.016558170318603516
0.016720294952392578
0.01662302017211914
0.016636371612548828
0.016774892807006836
0.016696453094482422
0.016619205474853516
0.016699790954589844
0.016585588455200195
0.016799211502075195
0.016625642776489258
0.0167233943939209
0.01663994789123535
0.016699552536010742
0.016746997833251953
0.01658010482788086
0.016807079315185547
0.016587495803833008
0.016744136810302734
0.01659679412841797
0.016631603240966797
0.01670980453491211
0.016607999801635742
0.016785144805908203
0.016599655151367188
0.016463041305541992
0.017026901245117188
0.016620397567749023
0.016751527786254883
0.016583919525146484
0.016785860061645508
0.016697168350219727
0.01660943031311035
0.016973018646240234
0.016569137573242188
0.016318798065185547
0.01639723777770996
0.01676321029663086
0.016609907150268555
0.0168612003326416
0.016663551330566406
0.017085552215576172
0.016602754592895508
0.01655125617980957
0.01681375503540039
0.01659107208251953
0.016810894012451172
0.016320228576660156
0.016033649444580078
0.017388105392456055
0.01663827896118164
0.016654014587402344
0.01689887046813965
0.016627073287963867
0.01652240753173828
0.016626358032226562
0.01672983169555664
0.016652584075927734
0.01675724983215332
0.016796350479125977
0.016522884368896484
0.016668319702148438
0.016669273376464844
0.016782283782958984
0.016748428344726562
0.016541481018066406
0.016684293746948242
0.016666412353515625
0.016706466674804688
0.016769886016845703
0.016675472259521484
0.016632795333862305
0.01681065559387207
0.01659417152404785
0.01660442352294922
0.01678180694580078
0.016598224639892578
0.016786575317382812
0.016542911529541016
0.016797304153442383
0.016782283782958984
0.01650404930114746
0.01680755615234375
0.016347646713256836
0.017220497131347656
0.016272544860839844
0.016724109649658203

But yes, the 33ms ones are gone. Anyway, seems like tasks are useless for this task.

Can’t say I see a difference with auto-flip true
Any ideas when this can happen?
While it happens on the Unity engine as well, it’s still possible that both engines aren’t well optimized for 120+Hz monitors.

I use both Windows and Linux, but tests are probably best performed on Windows where the NVidia driver is better supported and multi-monitor support is better (especially useful when gaming notebooks are used for tests).

rdb · October 15, 2023, 9:50pm

Hmm, I can’t be sure that this is what’s going on; it could be that the process is unlucky enough to get scheduled out by the OS scheduler at the wrong time, or that some other kind of variation caused the render to take a little too long, to miss the interval. Could try raising the process priority to “realtime”. Panda does double buffering by default; at the expense of latency, it may be possible to configure the driver to do triple or quadruple buffering to buffer against longer lag spikes.

Yes, more or less. You can also use a task chain and let Panda manage the threading, but in this case I can see no benefit from that.

Check back in a couple of hours and this folder will fill with various Linux builds, and this folder with Windows builds. Might take a while because the buildbots have a backlog to work through at the moment.

To enable PStats, first run the pstats.exe program and set this in Config.prc before running your application:

want-pstats true
pstats-max-rate 999999

# by default, pstats only retains the latest 60 seconds of profiling data
pstats-history 999999

# enables capture of GPU-side timing data, may be useful to see
pstats-gpu-timing true

# optional, to see individual python functions, but with more overhead
pstats-python-profiler true

Then, capturing will start automatically, and you can stop the application after you think you’ve captured a few repeated frames. You can look at the Timeline view under the Graphs window in PStats and navigate around with the WASD keys or the mouse.

greenpanda · October 15, 2023, 11:38pm

The linux folder seems to be filled with macos builds?

This is the best I could come up with with regards to using threading with Panda:

import time
import panda3d.core
from direct.stdpy import threading

from direct.showbase.ShowBase import ShowBase

base = ShowBase() # default Panda3d object

def high_precision_sleep(duration):
	# python's stock sleep() is very low precision for this use case
	
    start_time = time.perf_counter() # perf_counter is better than time here due to precision
    while True:
        elapsed_time = time.perf_counter() - start_time
        remaining_time = duration - elapsed_time
        if remaining_time <= 0:
            break
        else:
            pass

fps = 120

def periodic_function_call():
	
	while True:
		start_time = time.perf_counter()
		
		pass # function call goes here
		
		end_time = time.perf_counter()
		execution_time = end_time - start_time
		
		if execution_time < 1/fps:
			high_precision_sleep(1/fps - execution_time)
		else:
			print("Warning: Function execution time exceeds 20 ms.")
		end_time = time.perf_counter()
		
		print(end_time - start_time)
	
# Create a separate thread for periodic function calls
thread = threading.Thread(target=periodic_function_call)
thread.daemon = True  # Allow the program to exit when the main thread finishes
thread.start()

base.run()

Even this doesn’t seem accurate enough. After a short while the errors should accumulate into a frame of duration.

BTW, is there a proper way to get the refresh rate of the monitor than this?

def wait(task):
	if int(globalClock.getAverageFrameRate()) in (143,144,145):
		fps = 144
	else:
		fps = round(int(globalClock.getAverageFrameRate()))
	print(fps)
	
	return task.done
base.taskMgr.doMethodLater(2, wait, "wait") # wait two seconds until fps stabilizes

rdb · October 16, 2023, 1:13pm

Yes, sorry, there are Windows and Linux builds here now.

The main issue with your sleeping code is that it’s busy-waiting, hogging the GIL all the while. While that means you’re not dependent on the timing of any sleep function, it does mean that your thread is going to get scheduled out by the Python scheduler after it exceeds its alotted timeslice. If you use a “real” sleep function like time.sleep (or maybe there is a better function to wait until a given deadline) then it would release the GIL momentarily, though you’d be dependent on the precision. This is a complicated matter - you probably can’t really get good consistency with Python. If you wrote a C/C++ module for this, you could get more consistent results, but that’s a lot of work.

Oof, we have the DisplayInformation class, which provides a lot of information including a list of all display modes and the current display mode’s resolution, but I checked and the one thing that’s missing is the current refresh rate. This is an unfortunate omission, I’ve filed this issue to add it, which I’m going to tackle right away since it’s fairly simple to add, so expect it in the upcoming release. Still, the API is currently limited to single-monitor setups, so it needs more work in the future.

greenpanda · October 16, 2023, 1:49pm

I’m not sure if Python GIL hogging is the thing here. Isn’t Panda3D’s threading module reimplementation made in C++? Either way, it’s a sub-millisecond inconsistency vs the 2.5ms one with using Python’s sleep(), I’m not sure what it can be blamed on.

Thank you.
Another thing that seems missing in the docs is how to learn the monitor’s max supported resolution. You can set your program to fullscreen without knowing that, but you can’t set the resolution to the max as well, it seems. base.pipe.getDisplayHeight() and base.pipe.getDisplayWidth() provide the combined frame resolution in multi-monitor setups, even when one of the monitors is turned off/not used by the OS.
EDIT: Found it:

info = base.pipe.getDisplayInformation()
info.getDisplayModeWidth(index), info.getDisplayModeHeight(index))

Not sure if I can determine which monitor index is the one used by the Panda window though.

serega-kkz · October 16, 2023, 4:04pm

I use a script for these purposes, in this case the dictionary uses the screen resolution as the key. The last value of which will be the maximum resolution. To find out the frequency for a specific resolution, just specify the required resolution, the example displays the frequency for desktop resolution.

from direct.showbase.ShowBase import ShowBase

base = ShowBase()

display = {}

desktop_width = base.pipe.get_display_width()
desktop_height = base.pipe.get_display_height()

mode = base.pipe.get_display_information()

for i in range(mode.get_total_display_modes()):
    width  = mode.get_display_mode_width(i)
    height = mode.get_display_mode_height(i)
    rate = mode.get_display_mode_refresh_rate(i)

    if not (width, height) in display:
        display[(width, height)] = [rate]
    else:
        display[(width, height)].append(rate)

# Find out the supported frequency for a specific resolution.
print(display[(desktop_width, desktop_height)][:])

# The maximum possible resolution for the display.
for mode in display:
    print(mode)

base.run()

rdb · October 16, 2023, 4:32pm

The moment Panda’s C++ threading implementation calls your Python code, the normal rules for Python code apply - it grabs the GIL, and in your case probably doesn’t release it until your function exceeds the timeslice allotment.

These functions should more accurately be called “getDesktopWidth” and “getDesktopHeight”, since they return the size of the virtual desktop in which windows are positioned.

I’m adding a function to query the current display mode - but there isn’t a good multi-monitor API just yet.

greenpanda · October 16, 2023, 6:32pm

I’m trying to use your latest panda build for the Pstats on Windows, but it doesn’t seem to install a Python with it as before?

We’ve (with another programmer) also tested the monitor frame repeating issue with pure DirectX. I’m providing you a simplified code here.

This is the issue with pure DirectX, Unity and Panda3D (Unreal to be tested):

We have set the priorities for the thread and the process at the highest possible values that are given by the underlying OS(win32).
We have a most basic render loop which simply clear colors the framebuffers. We simply call swapbuffers and wait for the next Vsync point.
Using any graphics API with double,triple and quadruple buffering offers no solution.
These are the main bullet points of the artifact we are facing right now.

Again, the issue happens on most monitors, but not all, so is not a fact of life.
Most of the monitor which have the issue are gaming monitors, so they don’t have a framebuffer of their own which could cause the issue.
The hardware guys I’ve talked with in various places are adamant this is not issue with the monitor hardware.

This is also something absolutely unacceptable for VR. And when we try a Unity VR version or our program with existing PC VR headsets, the issue isn’t there. Checking the PCBs of the VR headsets, they are identical to regular monitors. They connect to the PCs over DIsplayPort like regular monitors, and have regular display panels inside.

So I’d guess the VR codebase of Unity has a solution for this, but Unity, being so bloated and large, the devs haven’t applied the solution to regular Unity target platforms, which would be benificial for high fps competitive gaming.

If we find the engine-agnostic solution here, it would also benefit Panda3D, if there’s aim in making it usable for projects involving modern VR (90-180Hz) or competitive gaming (144-240Hz).

DX-Renderer.zip (900.4 KB)
DX-Renderer-executable.zip (1.6 MB)

rdb · October 16, 2023, 9:20pm

It should - what makes you think it doesn’t?

I’d be interested in hearing about it if you find out anything more. I know that the presentation APIs in OpenGL and DirectX are quite limited - perhaps there is a way to get more control, via another API like Vulkan, or by disabling vsync and using a lower-level interface to wait for the blanking interval, or some vendor-specific API.

I’m out of the loop on how VR technology works, but I suspect they do get special treatment by the driver with a bespoke presentation path - but I’d need to dive into it to learn more. I know for the Oculus SDK you can’t just use the regular DirectX Present interface, you need to use a function exposed by the SDK, which unfortunately is closed-source so we can’t peer inside. I suspect that’s where the magic is and not in Unity itself.

rdb · October 18, 2023, 10:54pm

I found this extension:
https://registry.khronos.org/OpenGL/extensions/EXT/WGL_EXT_swap_control_tear.txt

Enabling this would mean that missing the vblank interval won’t cause a stall until the next one, but at the cost of possibly seeing a partial frame.

It’s trivial to support this in Panda with a config switch, so I’m happy to add this.

On Linux, it seems there are more extensions for much more precise swap control, such as GLX_OML_sync_control. There is also an extension for querying the number of video syncs that have occurred, GLX_SGI_video_sync.

rdb · October 20, 2023, 1:46pm

On Windows, it may be useful to use this method to get detailed timing information, which may help your debugging. You can get at it using ctypes, I would imagine.

Maybe I can add this information to PStats.

greenpanda · October 20, 2023, 9:05pm

Hi rdb,

Sorry for the delay, I wanted to prepare some data for you but it’s not ready yet so I’ll respond to your messages for now:

There’s no python subfolder in the installation folder.

The extension you pointed out seems to simply make use of the adaptive vsync tech. We have tried it and it causes screen tearings instead of frame repeating, which also produces visible flashes/flicker.
The extension you pointed out is for specific OpenGL version which targets the Web applications. Vulkan also has an extension for using adaptive vsync tech too. Directx11-Directx12 usage is much simpler.

So if it’s not already supported, you could add support for adaptive vsync to Panda3D, and also variable refresh rate tech such as G-Sync/FreeSync/etc. But it wouldn’t solve the issue of frame repeating on regular monitors we are experiencing. Even the monitors which are labeled as supporting Freesync, seem to be hit or miss. In VR, it seems to simply not exist and the display panels expect a fixed refresh rate as input, probably because variable refresh rate would be ruined by VR low perisistence displays and cause flicker/flash or (if the persistence was dynamically increased) motion blur with all of its negative side effects in VR.

I’ll get back to you on this.

rdb · October 21, 2023, 2:33pm

Whoops, thanks for pointing this out. I’ll fix the buildbot, though any external Python installation will also still work - you can just grab an installer from python.org. Or wait until I’ve made a new build.

You’re right, adaptive sync is the commonly used term for this feature. If the tearing on these frame misses is not acceptable for you, then this will not be useful. Nonetheless, I’ve opened an issue to implement it. (For what it’s worth, WGL is for Windows, not the web.)

I’ve done some digging around and learned a couple of things that may or may not be useful to you.

Windows may or may not put the application in “exclusive fullscreen” mode (bypassing the compositor, which is what you want), depending an a couple of factors. See this page.
When not in exclusive fullscreen mode, it may be necessary to sync with the compositor by adding a call to DwmFlush after the SwapBuffers call. Some people have reported this to avoid an occasional frame stutter.
Windows can throttle the application if it waits a lot for VSync, since that can make Windows think that the application is low priority, which can cause occasional frame misses. We need to add code like this to disable this:

if (PowerSetActiveScheme(NULL, &GUID_MIN_POWER_SAVINGS) != 0) {
    std::cerr << "Setting power saving mode failed.\n";
}

greenpanda · January 10, 2024, 3:03pm

Hi rdb,

So I wanted to let you know that we’ve been busy investigating this, just didn’t have an update.

Here’s the latest DirectX test program executable and sourcecode. The test program means to show that the issue is not just with Panda3D, but still if we find a solution it can make its way to the Panda3D source as well.

The test program provided is meant to illustrate, that even when just a blank frame (with a different color each frame) is presented each frame, there’s still an alarming number of repeated frames at the monitor (every 30 seconds or so) with our test laptops (HP Victus 15 2022 model, Aorus 15G-XC, Asus G752VS) even at 120Hz. Here’s the laptop logs, together with a spreadhseet of the results: Program result logs on 3 laptops, Dec 13 2023.7z - pCloud

Blue-red alternating colors is just for the test and is a good way to visually notice the issue clearly as it happens and in a video camera capture. They are easier to spot during testing than stutters, although stutters are still visible themselves.

Here’s video camera captures of the test program on our devices.

(recorded with two cameras simultaneously, one 120fps and one 480fps, logs text files show the offending frame indexes)

Here’s few GIF animations I’ve generated from the videos, of just the few frames where the frame repeat happens. gifs of the artifacts (slow mo) - pCloud

To summarize what the test program does:

2 grids have white cards moving each frame, mouse cursor is also moved each frame (use Atl+F4 to close program). Top grid card and the mouse cursor, unlike the bottom grid card, compensate for the repeated frame as soon as the program becomes aware that it has occurred.

When a frame repeat happens, you notice both the white cards and mouse cursors freezing in place, then:

The mouse cursor jumps a position to compensate for the repeat.
The top card moves a cell, only later jumps a cell to compensate. This is because the next frame was already presented to the GPU and the 3d program couldn’t recall the next presented frame when it learned that the current presented frame had been repeated.
The bottom white card just resumes moving as usual, as it has not been programmed to compensate its position due to a frame repeat.

I’m sure you’ll agree that it’s not okay for the end user to experience this every 30 seconds or so on modern gaming laptops and at as low as 120Hz.

We are simultaneously testing the code using the Nvidia VRWorks API, which is meant for VR and in theory should bypass DWM. We experience the issue with and without using VRWorks. Since VRWorks API requires an NDA and needs an edit to the Windows registry to work with normal monitors, so I am only supplying the pure DirectX version of the program to you for testing.
We also tested the program on an older PC running Windows 7 which is the latest Windows version that still allowed to explicitly bypass DWM via dwmapi.h or when using classic windows themes. The artifacts were still there.
So the issue does not seem to be related to DWM.

To clarify, the issue is not merely that only one frame sometimes stays longer / is displayed twice the duration than it should. There’s actually 2 mis-timed frames when this happens. The first frame gets repeated on the monitor, while the 2nd frame is already rendered and presented to the GPU and is displayed at the time slot of the 3rd expected frame (counted as 3rd if no repeating were to occur).

And often times, the frame is actually repeated twice or more in sequence, not just once.

DirectX SetFullscreenState does not seem to prevent the issue from happening. If there’s an alternative way to enable true fullscreen (I think you’re saying that what we do now is only “windowed” fullscreen?), please let us know.

Going one step ahead: are you aware if there is any way to query the GPU or some other way determine a frame repeat is happening, before the repeating finishes? And/or is there any way to recall a presented frame or some other way halt the streaming of frames to the monitor when a frame is repeated once but hasn’t started streaming the next presented frame yet? This will at least solve the issue of having a frame repeated more than once and prevent later presented frames from being displayed later in time than they should have.

Tried all this without noticing a difference in the results.

Thanks

greenpanda · February 8, 2024, 10:21pm

Bumping this just this once.
Thanks

rdb · February 25, 2024, 8:54am

Sorry for the delayed response. Unfortunately, I’m just as baffled as you are. It sounds like something is happening to miss the VSync deadline. I’d suggest a scheduling hiccup, but you’re already setting the thread to highest priority. Maybe it’s still a scheduling hiccup, but it occurs somewhere in a driver thread. Using some kind of visual timeline analysis tool might help to understand the issue a bit more. PStats has this, but doesn’t currently log vsyncs (maybe a good feature to add).

Have you timed the call to WaitForVBlank - could it be that it’s waiting too long in the preceding frame?

Does having a longer swap chain help?

Perhaps this is an issue to bring to graphics card manufacturers or OS developers.

It should automatically enable “true fullscreen” in OpenGL if certain conditions are met, but it sounds like this is indeed unrelated.

There may be, with the aforementioned OpenGL extensions (on Linux), or on Windows, perhaps you can use DwmGetCompositionTimingInfo, which returns a big struct of potentially useful display timing parameters.