Notes in preparation for adding IMAP to legacy bounce MailDir paradigm
For IMAP, the start of each process should not assume that a connection does or does not exist. Check the connection using 'imap ping' before login. This should help recover from connection drop-outs caused by intermittent or one-time connection issues.
Each scheduled event should quit in time for the next process, so that the IMAP info being processed is always nearly up-to-date. This is important in case a separate manual IMAP process is working in tandem and changing circumstances. Quitting in time is equally important because IMAP references emails by relative sequence. Two concurrent connections would likely have different and overlapping references. The overlapping references would likely cause issues, since each connection would process the duplicates as if they were not duplicates.
scan_in_est_dur_per_cycle_s
Check scan_incoming_active_p when running a new cycle. Also set replies_est_next_start to clock seconds for use with time calcs later in the cycle.
If already running, wait a second and check again, repeating until 90% of the duration has elapsed.
If still running, log a message and quit in time for the next event.
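The wait-and-check behavior described here could be sketched in Tcl roughly as follows. This is only an illustration: the proc name wait_for_prior_cycle and its arguments are hypothetical, and the "is a prior cycle active" check is passed in as a command rather than read from scan_incoming_active_p directly.

```tcl
# Hypothetical helper: wait for a prior cycle to finish, giving up at
# 90% of the estimated cycle duration.
#   active_cmd : command returning 1 while a prior cycle is still running (assumed)
#   cycle_s    : estimated duration of one cycle, in seconds
#   poll_ms    : pause between checks in milliseconds (the notes suggest one second)
proc wait_for_prior_cycle {active_cmd cycle_s {poll_ms 1000}} {
    set start_ms [clock milliseconds]
    set limit_ms [expr { int($cycle_s * 1000 * 0.9) }]
    while { [uplevel #0 $active_cmd] } {
        if { [clock milliseconds] - $start_ms >= $limit_ms } {
            # Still running at the 90% mark: the caller should log a
            # message and quit in time for the next event.
            return 0
        }
        after $poll_ms
    }
    # The prior cycle is done; it is safe to proceed.
    return 1
}
```

A return value of 0 means the caller should log and quit; 1 means the new cycle may start.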
Each scheduled procedure should also use as much time as it needs up to the cut-off at the next scheduled event. Ideally, it needs to forecast if it is going to go overtime with processing of the next email, and quit just before it does.
Use duration_ms_list to determine a time adjustment for quitting before next cycle:
scan_in_est_dur_per_cycle_s + scan_repies_start_time = scan_in_est_quit_cs
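One way the forecast might work is sketched below. The proc name time_for_another_p is hypothetical, and using the slowest email seen so far as the forecast is an assumption made for illustration; the notes do not say which statistic of duration_ms_list to use.

```tcl
# Hypothetical helper: decide whether to start processing another email.
#   duration_ms_list : durations (ms) of emails already processed this cycle
#   now_ms           : current time in milliseconds
#   quit_ms          : cut-off time in milliseconds (scan_in_est_quit_cs * 1000)
proc time_for_another_p {duration_ms_list now_ms quit_ms} {
    if { [llength $duration_ms_list] == 0 } {
        # No history yet: proceed only if the cut-off has not passed.
        return [expr { $now_ms < $quit_ms }]
    }
    # Assumption: the slowest email so far is a conservative forecast.
    set forecast_ms [tcl::mathfunc::max {*}$duration_ms_list]
    return [expr { $now_ms + $forecast_ms < $quit_ms }]
}
```

The processing loop would call this before each email and stop as soon as it returns 0.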
And yet, predicting the duration of a future process is difficult. What if the email is 10 MB and needs to be parsed, whereas all prior emails were less than 10 kB? What if one of the callbacks converts a PDF into a PNG, annotates it for a web view, and takes a few minutes? What if the next 5 emails have callbacks that each take 5 to 15 minutes to process while waiting on an external service?
The process needs to be split into at least two to handle all cases.
The first process collects incoming email and puts it into a system standard format with the minimal amount of effort sufficient for use by callbacks. The goal of this process is to keep up with incoming email, so that all mail is available to the system at the earliest possible moment.
The second process should render a prioritized queue of imported emails that have not been processed. It first prioritizes new entries, perhaps re-prioritizing any callbacks that error, or sampling and re-introducing prior errant callbacks etc., then continues to process the stack.
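A minimal sketch of rendering such a queue is shown below. The entry layout {id priority processed_p} and the proc name pending_by_priority are hypothetical, and it assumes a lower priority number means the email is handled sooner. Re-prioritizing errant callbacks is not covered here.

```tcl
# Hypothetical helper: return unprocessed entries ordered by priority.
# Each entry is assumed to be a three-element list: {id priority processed_p}.
proc pending_by_priority {queue} {
    set pending [list]
    foreach entry $queue {
        lassign $entry id priority processed_p
        if { !$processed_p } {
            lappend pending $entry
        }
    }
    # Assumption: a lower number means the email is handled sooner.
    return [lsort -integer -index 1 $pending]
}
```

Because the result is an ordinary list, parallel workers could each take entries from the front without changing the overall approach.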
Using this paradigm, parallel processes could be invoked for the queue without significantly changing the paradigm.
To reduce overhead on low volume systems, these processes should be scheduled to minimize concurrent operation.
Priorities should offer 3 levels of performance. Colors designate priority to discern from other email priority schemes:
Priority is calculated based on timing and file size
set range priority_max - priority_min
set deviation_max { $range / 2 }
set midpoint { priority_min + $deviation_max }
time_priority = $deviation_max * ( clock seconds of received datetime - scan_in_start_cs ) / ( 2 * scan_in_est_dur_per_cycle_s )
size_priority = $deviation_max * ( ( (size of email in characters) / (config.tcl's max_file_upload_mb * 1000000) ) - 0.5 )
set equation = int( $midpoint + ($time_priority + $size_priority) / 2 )
Average of time and file size priorities.
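The calculation above could be written as runnable Tcl along these lines. The proc name email_priority and its argument order are hypothetical; the arithmetic follows the notes, with floating-point division used so that intermediate values are not truncated.

```tcl
# Hypothetical helper: priority as the average of a time-based and a
# size-based component, centered on the midpoint of the priority range.
proc email_priority {
    received_cs size_chars scan_in_start_cs scan_in_est_dur_per_cycle_s
    priority_min priority_max max_file_upload_mb
} {
    set range [expr { $priority_max - $priority_min }]
    set deviation_max [expr { $range / 2.0 }]
    set midpoint [expr { $priority_min + $deviation_max }]
    # Time component: grows the later the email was received relative
    # to the start of the scan.
    set time_priority [expr {
        $deviation_max * ($received_cs - $scan_in_start_cs)
        / (2.0 * $scan_in_est_dur_per_cycle_s)
    }]
    # Size component: zero when the email is half the maximum upload size.
    set size_priority [expr {
        $deviation_max
        * ( double($size_chars) / ($max_file_upload_mb * 1000000.0) - 0.5 )
    }]
    return [expr { int( $midpoint + ($time_priority + $size_priority) / 2.0 ) }]
}
```

With a range of 100 to 900, an email received exactly at scan start and sized at half the upload limit lands on the midpoint, 500; an empty email received at the same moment lands on 400.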
hpri_package_ids, lpri_package_ids, hpri_party_ids, lpri_party_ids, mpri_min, mpri_max, hpri_subject_glob and lpri_subject_glob are defined in acs_mail_lite_ui, so they can be tuned without restarting the server. P.S. Code should check if the user is banned before parsing any further.
A proc should be available to recalculate existing email priorities. This means more info needs to be added to table acs_mail_lite_from_external (including size_chars).
This scheduling should be simple. Maybe check if a new process wants to take over. If so, quit.
If the next cycle starts and the current cycle is still running, set scan_in_est_dur_per_cycle_s_override to the actual wait time the current cycle has to wait, including any prior cycle wait time if the delays exceed one cycle (accumulative_delay_cycles).
From acs-tcl/tcl/test/ad-proc-test-procs.tcl
# This example gets the list of implementations of a callback (so they could be triggered one by one):
ad_proc -callback a_callback { -arg1 arg2 } { this is a test callback } -
set callback_procs [info commands ::callback::a_callback::*]
Each subsequent cycle moves toward renormalization by adjusting scan_in_est_dur_per_cycle_s_override toward the value of scan_in_est_dur_per_cycle_s by one replies_est_dur_per_cycle, with a minimum of scan_in_est_dur_per_cycle_s.
Changes are exponential to quickly adjust to changing dynamics.
For acs_mail_lite::scan_in,
Keep track of email flags while processing.
Mark /read when reading.
Mark /replied if replying.
When quitting the current scheduled event, don't log out if all processes are not done.
Also, don't log out if imaptimeout is greater than the duration to cycle_est_next_start_cs.
Stay logged in for the next cycle.
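The two logout conditions could be combined as in the sketch below. The proc name imap_logout_p and its arguments are hypothetical; times are in seconds.

```tcl
# Hypothetical helper: return 1 if the IMAP session should be closed
# at the end of the current scheduled event, 0 to stay logged in.
#   all_done_p    : 1 when all processes have finished
#   imaptimeout_s : session timeout in seconds
#   next_start_cs : clock seconds of the next cycle's expected start
#   now_cs        : current clock seconds
proc imap_logout_p {all_done_p imaptimeout_s next_start_cs now_cs} {
    if { !$all_done_p } {
        # Work is still pending: keep the session.
        return 0
    }
    set wait_s [expr { $next_start_cs - $now_cs }]
    if { $imaptimeout_s > $wait_s } {
        # The session will outlive the wait: keep it for the next cycle.
        return 0
    }
    return 1
}
```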
Delete processed messages when done with a cycle? No. What if a message is used by a callback with a delay in processing? Move processed emails into a designated folder named by the ProcessFolderName parameter. The designated folder may be Trash. If ProcessFolderName is empty, the default is the hostname of ad_url, i.e.: [util::split_location [ad_url] protoVar ProcessFolderName portVar] If the folder does not exist, create it. ProcessFolderName only needs to be checked if the name has changed.
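The defaulting rule for the folder name can be illustrated outside of OpenACS with plain Tcl. In this sketch the proc name default_process_folder is hypothetical, and a regular expression stands in for util::split_location so the example runs in a bare tclsh; inside the server, the util::split_location call would be used instead.

```tcl
# Hypothetical helper: pick the folder for processed email.
#   process_folder_name : value of the ProcessFolderName parameter (may be empty)
#   site_url            : the site URL, as ad_url would return it
proc default_process_folder {process_folder_name site_url} {
    if { $process_folder_name ne "" } {
        return $process_folder_name
    }
    # Stand-in for util::split_location: take the host part of the URL.
    if { [regexp {^[a-zA-Z][a-zA-Z0-9+.-]*://([^/:]+)} $site_url match host] } {
        return $host
    }
    return ""
}
```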
MailDir marks email as 'read' by moving it from the '/new' directory to the '/cur' directory. ACS Mail Lite implementations should be as consistent as possible, and so should mark emails in IMAP as 'read' also.
Since messages are not immediately deleted, create a table of attachment url references. Remove attachments older than AttachmentLife parameter seconds. Set the default to 30 days old (2592000 seconds). Unless ProcessFolderName is Trash, email attachments can be recovered from the original email in ProcessFolderName. No. Once callbacks are processed, assume any transfer of attachments has occurred, so that processed email can be purged.
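Selecting attachments for removal could look like the following sketch. The proc name expired_attachments and the flat {url created_cs ...} list are hypothetical stand-ins for a query against the attachment reference table.

```tcl
# Hypothetical helper: return the urls of attachments older than
# AttachmentLife seconds.
#   attachments       : flat list of url / creation-time (clock seconds) pairs
#   now_cs            : current clock seconds
#   attachment_life_s : AttachmentLife in seconds (default 30 days)
proc expired_attachments {attachments now_cs {attachment_life_s 2592000}} {
    set expired [list]
    foreach {url created_cs} $attachments {
        if { $now_cs - $created_cs > $attachment_life_s } {
            lappend expired $url
        }
    }
    return $expired
}
```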