Open Bug 1404312 Opened 7 years ago Updated 2 years ago

[Shield] Pioneer Enrollment - timers (+extended delays) cannot be confirmed during ongoing session

Categories

(Toolkit :: Application Update, defect, P3)

defect

Tracking

()

Tracking Status
firefox-esr52 --- affected
firefox56 --- affected
firefox57 --- affected
firefox58 --- affected
firefox59 --- ?

People

(Reporter: aflorinescu, Unassigned)

Details

Attachments

(1 file)

[Description:]
During testing of the Pioneer Enrollment, we hit a problem with the delays for second prompt, Pioneer Enrollment add-on uninstall and Expire without enrolling. (https://docs.google.com/document/d/12GmLkXU0MXlHhGaY6UeTEoubEDMwaUJJ0u0HpyyKYIE/edit#) 

I think this should be a major severity level until we can asses how bad it is if the actions not being triggered in a fashionable waiting time actually prolong to the session lifespan. (which can be days) 

[Prerequisites:]
Ensure the following preferences are set :
        extensions.pioneer-enrollment-study.updateTimerInterval = 1
        extensions.pioneer-enrollment-study.firstPromptDelay = 1
        extensions.pioneer-enrollment-study.secondPromptDelay = 180000
        extensions.pioneer-enrollment-study.studyEndDelay = 180000
        extensions.pioneer-enrollment-study.studyEnrolledEndDelay = 180000
        extensions.pioneer-enrollment-study.enrollmentState = ‘’ (empty string)

[Steps:]
1. Open FF and set the prerequisites.
2. Install add-on https://net-mozaws-prod-addons-signxpi-output.s3.amazonaws.com/pioneer-enrollment-study-2.0.1.xpi.
3. Click on x to close the notification pop-up and also close the pioneer page.
4. Wait.

[Actual Result:]
The second notification pop-up didn't show. (reached 1h wait time)

[Expected Result:]
The second notification pop-up should show in a reasonable amount of time, given that the delay is set to 3'. 

[Note:]
As a work-around, restarting the browser would do the trick and the 2nd notification would show.
(In reply to Adrian Florinescu [:AdrianSV] from comment #0)

> [Note:]
> As a work-around, restarting the browser would do the trick and the 2nd
> notification would show.

[Note]:
On Mac OS x 10.12, the 2nd notification is displayed only after 3 restarts.
Assignee: nobody → mkelly
I'm unable to replicate this on MacOS 10.12 locally or on Windows 10 in a VM. On both platforms, over several tries, I get reprompted a second time after waiting ~3-5 minutes.

The two main possibilities I can think of:

1. When the notification isn't showing, are there any errors in the browser console? An error might have broken the timer check.

2. Is the clock on the test OS working properly? If the system time is frozen or otherwise not updating, then the timer for reprompting probably won't work either.

You might also try setting the delays from 180000 down to 1. It shouldn't make any difference, but it'd be interesting to see if it does.

Also, are there any other STR that might help replicate this? I've been making sure to close the prompts like comment 1 says, and then I've just left the browser open without using the computer. I've also tested while using the browser to look at reddit.
Flags: needinfo?(roxana.leitan)
Flags: needinfo?(adrian.florinescu)
Update: I was able to replicate this once yesterday, and have been unable to replicate since then. When I checked the timer delay (by setting debugging breakpoints in the timer code), I noticed that the timer interval had been set to 1 day instead of 20 minutes like I had configured the add-on to use.

I haven't been able to isolate any particular steps that cause this, and the delay was fixed after I restarted the browser.
I'm able to reliably replicate this using the steps from bug 1382761 comment 11. I've tracked the issue down to a bug in Firefox with how our update timers work. Depending on the order that timers are registered, it's possible for the update timer to delay too long and ignore timers with smaller delays (I'll be confirming and filing a bug for this tomorrow).

On my test profiles, I observed a delay of ~45 minutes being set. In addition, this bug kicks in after the rest of the timers are processed, which on a new profile would take ~10-20 minutes. This matches up with the description of an hour-long wait time.

I'll be investigating a workaround for this tomorrow; in the meantime, the expected behavior with this bug (and a version of the add-on with the timers set to small delays) would be that Firefox will launch the second prompt on the order of 1-2 hours after the first prompt if the browser isn't restarted, and immediately if the browser is restarted about ~15 minutes after the first prompt appears. I would also expect a used profile (so that initial update timers have all run) that has been running for 10+ minutes to not hit this bug at all.

Since the actual delay we use is 2 days after the first prompt, we probably don't run into this issue when deployed to users; the data we have from the Beta version of this study did not show any abnormalities in the number of users reaching the second prompt.

Given that, I'm comfortable if we observe a second prompt after 2 hours for testing, but if ya'll still think it's a blocker, than we can wait until I come up with a workaround tomorrow.
Using https://bugzilla.mozilla.org/show_bug.cgi?id=1382761#c11, unfortunately we are still hitting waiting times for 40-50'+ for 2nd notification or the automatic add-on uninstall.
Flags: needinfo?(roxana.leitan)
Flags: needinfo?(adrian.florinescu)
So the problem with nsTimerManager is here: http://searchfox.org/mozilla-central/source/toolkit/components/timermanager/nsUpdateTimerManager.js#159

If there are two timers due to be run, and the first timer that gets called with tryFire has a smaller interval than the second one, than skippedFirings will never be set to true, because both timers go through the first branch of that if statement. And because selected was true, nextDelay is never set.

Then, when the second timer fires, nextDelay is set to a new value based on the interval of the second timer: http://searchfox.org/mozilla-central/source/toolkit/components/timermanager/nsUpdateTimerManager.js#257

Finally, when we go to set the delay until the next updateManager tick, nextDelay is set based on the longer interval, and skippedFirings is false, so the delay is set to a potentially large delay despite the fact that the first timer had a shorter interval and was due to be run: http://searchfox.org/mozilla-central/source/toolkit/components/timermanager/nsUpdateTimerManager.js#265

In Pioneer's case, our delay in testing was 1 second, but a telemetry timer that had a 45 minute delay was registered after our timer, and caused a longer delay than the minimum 2 minutes.

I may come back to this to work on writing a test and fix for this, but for now I'm unassigning this to focus on other Pioneer.
Assignee: mkelly → nobody
No longer blocks: 1382761
Component: Shield Study → Application Update
Product: Shield → Toolkit
Can this be reproduced without using extremely small intervals? If not then the better solution would be to clamp those values as necessary in the timer manager since extremely short intervals shouldn't be supported by the timer manager anyways.
Flags: needinfo?(mkelly)
Also, if this timer isn't needed across sessions you should be using a regular timer.
(In reply to Robert Strong PTO 10/3 [:rstrong] (use needinfo to contact me) from comment #7)
> Can this be reproduced without using extremely small intervals? If not then
> the better solution would be to clamp those values as necessary in the timer
> manager since extremely short intervals shouldn't be supported by the timer
> manager anyways.

I believe it should. When I get a chance I can see about writing an XPI to demonstrate the issue; I got stuck trying to write a unit test for it but an XPI should be easier.

(In reply to Robert Strong PTO 10/3 [:rstrong] (use needinfo to contact me) from comment #8)
> Also, if this timer isn't needed across sessions you should be using a
> regular timer.

The normal value for the timer is 11 hours and needs to work across sessions. Luckily, that's large enough that the timer conflict at startup that delays evaluation for ~45 minutes isn't blocking us from launching the Pioneer Enrollment study. I suspect that the most likely case of this failing in the future is a study with, say, a 20 minute timer.
Flags: needinfo?(mkelly)
Here's the aforementioned XPI for reproducing the issue. The source code is here: https://github.com/Osmose/timer-bug-1404312

What it does is register two timers on startup: one with a 2 minute interval, and another with an hour-long interval. The timer names are generated with a timestamp, so they're new timers every time. Because they're new timers, once installed each timer should run once and log a message to the browser console, first the 2 minute timer, then the 1 hour timer.

Once those two run, the bug kicks in, and the delay is set to an hour. You can observe this with the browser toolkit by setting a break point on this line before installing the add-on: http://searchfox.org/mozilla-central/source/toolkit/components/timermanager/nsUpdateTimerManager.js#269. Once that line runs, if you inspect timer.delay, you should be able to see it set to an hour.

For the rest of the hour, the 2 minute timer will not be executed, due to the delay being set incorrectly based on the larger timer.
Priority: -- → P3

In the process of migrating remaining bugs to the new severity system, the severity for this bug cannot be automatically determined. Please retriage this bug using the new severity system.

Severity: major → --
Severity: -- → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: