Fix upsert speakers during schedule import#3433
Fix upsert speakers during schedule import#3433ArnavBallinCode wants to merge 11 commits intofossasia:devfrom
Conversation
Reviewer's GuideEnhances the schedule import to robustly upsert speakers by normalizing references, creating/updating users and speaker profiles, and improving how speakers are matched and linked to submissions and events. Sequence diagram for upserting session speakers during submission importsequenceDiagram
participant ImportTask
participant TalkImport as TalkImportModule
participant ImportSubmission as _import_submission_row
participant UpsertSpeaker as _upsert_session_speaker
participant UserModel as User
participant SpeakerProfileModel as SpeakerProfile
participant SpeakerRoleModel as SpeakerRole
ImportTask->>TalkImport: import_speakers(event, fileid, settings, locale, user_id)
TalkImport->>ImportSubmission: _import_submission_row(event, settings, record, acting_user, speaker_cache)
loop for each submission row
ImportSubmission->>ImportSubmission: _zip_speaker_refs_and_names(linked_speakers, speakers_val)
ImportSubmission->>ImportSubmission: build cache_key for each pair
alt cache_miss
ImportSubmission->>UpsertSpeaker: _upsert_session_speaker(event, speaker_ref, speaker_name)
Note right of UpsertSpeaker: Match or create User and SpeakerProfile
UpsertSpeaker->>UserModel: _find_user_for_speaker by ref or name
alt user_found
UpsertSpeaker->>UpsertSpeaker: _normalize_email_address(speaker_ref)
UpsertSpeaker->>UpsertSpeaker: _normalize_speaker_identifier(speaker_ref)
UpsertSpeaker->>UserModel: update fullname, email, code
UserModel-->>UpsertSpeaker: user_saved
else user_not_found
UpsertSpeaker->>UpsertSpeaker: derive fallback_name
UpsertSpeaker->>UserModel: create_user(password, email, fullname, code, pw_reset_token, pw_reset_time)
UserModel-->>UpsertSpeaker: new_user
end
UpsertSpeaker->>SpeakerProfileModel: get_or_create(user, event)
SpeakerProfileModel-->>UpsertSpeaker: speaker_profile
UpsertSpeaker-->>ImportSubmission: user
ImportSubmission->>ImportSubmission: cache[user_ref_and_name] = user
else cache_hit
ImportSubmission->>ImportSubmission: user = speaker_cache[cache_key]
end
alt user_exists
ImportSubmission->>SpeakerRoleModel: get_or_create(submission, user)
SpeakerRoleModel-->>ImportSubmission: speaker_role
end
end
ER diagram for user, speaker profile, and speaker role upsert relationshipserDiagram
USER {
int id
string email
string fullname
string code
string password
string pw_reset_token
datetime pw_reset_time
}
EVENT {
int id
string name
}
SUBMISSION {
int id
string title
int event_id
}
SPEAKERPROFILE {
int id
int user_id
int event_id
}
SPEAKERROLE {
int id
int submission_id
int user_id
}
USER ||--o{ SPEAKERPROFILE : has
EVENT ||--o{ SPEAKERPROFILE : has
USER ||--o{ SPEAKERROLE : has
SUBMISSION ||--o{ SPEAKERROLE : has
EVENT ||--o{ SUBMISSION : has
Class diagram for updated talk import and speaker upsert logicclassDiagram
class TalkImportModule {
+_normalize_email_address(value: str) str
+_split_csv_values(raw_value: str) list~str~
+_zip_speaker_refs_and_names(linked_speakers: str, speakers_val: str) list~tuple~
+_upsert_session_speaker(event, speaker_ref: str, speaker_name: str)
+_import_speaker_row(event, settings: dict, record, acting_user)
+_import_submission_row(event, settings: dict, record, acting_user, speaker_cache: dict)
}
class User {
+id: int
+email: str
+fullname: str
+code: str
+password: str
+pw_reset_token: str
+pw_reset_time: datetime
+create_user(password: str, email: str, fullname: str, code: str, pw_reset_token: str, pw_reset_time: datetime) User
+save(update_fields: list~str~) void
+objects_filter_email_iexaxt(email: str) QuerySet~User~
+objects_filter_code_iexact(code: str) QuerySet~User~
}
class Event {
+id: int
+name: str
}
class Submission {
+id: int
+title: str
+event: Event
}
class SpeakerProfile {
+id: int
+user: User
+event: Event
+objects_get_or_create(user: User, event: Event) SpeakerProfile
}
class SpeakerRole {
+id: int
+submission: Submission
+user: User
+objects_get_or_create(submission: Submission, user: User) SpeakerRole
}
TalkImportModule ..> User : creates_or_updates
TalkImportModule ..> SpeakerProfile : creates
TalkImportModule ..> SpeakerRole : creates
TalkImportModule ..> Event
TalkImportModule ..> Submission
User "1" -- "many" SpeakerProfile : has
Event "1" -- "many" SpeakerProfile : has
User "1" -- "many" SpeakerRole : has
Submission "1" -- "many" SpeakerRole : has
Event "1" -- "many" Submission : has
File-Level Changes
Tips and commandsInteracting with Sourcery
Customizing Your ExperienceAccess your dashboard to:
Getting Help
|
There was a problem hiding this comment.
Hey - I've found 1 issue, and left some high level feedback:
- In
_import_speaker_row,normalized_email = email.lower().strip()will throw ifemailisNone; consider reusing_normalize_email_addressor guarding forNone/empty values before calling string methods. - The new email normalization/uniqueness checks are implemented slightly differently in
_normalize_email_addressand_import_speaker_row; consider consolidating to a single helper to ensure consistent validation and normalization of email fields across import paths.
Prompt for AI Agents
Please address the comments from this code review:
## Overall Comments
- In `_import_speaker_row`, `normalized_email = email.lower().strip()` will throw if `email` is `None`; consider reusing `_normalize_email_address` or guarding for `None`/empty values before calling string methods.
- The new email normalization/uniqueness checks are implemented slightly differently in `_normalize_email_address` and `_import_speaker_row`; consider consolidating to a single helper to ensure consistent validation and normalization of email fields across import paths.
## Individual Comments
### Comment 1
<location path="app/eventyay/base/services/talkimport.py" line_range="174-183" />
<code_context>
+ if normalized_ref and not normalized_email:
+ normalized_identifier = _normalize_speaker_identifier(normalized_ref)
+
+ if user:
+ update_fields = []
+ if normalized_name and user.fullname != normalized_name:
+ user.fullname = normalized_name
+ update_fields.append('fullname')
+ if normalized_email and not user.email:
+ if not User.objects.filter(email__iexact=normalized_email).exclude(pk=user.pk).exists():
+ user.email = normalized_email
+ update_fields.append('email')
+ if (
+ normalized_identifier
+ and not user.code
+ and not User.objects.filter(code__iexact=normalized_identifier).exclude(pk=user.pk).exists()
+ ):
+ user.code = normalized_identifier
+ update_fields.append('code')
+ if update_fields:
+ user.save(update_fields=update_fields)
+ else:
+ fallback_name = normalized_name
</code_context>
<issue_to_address>
**issue (bug_risk):** Concurrent imports could race on email/code uniqueness checks in `_upsert_session_speaker`.
This `exists()`-then-assign pattern for `email`/`code` is subject to a race if two workers import the same speaker concurrently: both can pass the check and then one will hit an `IntegrityError` on save due to the DB unique constraint. If this function can run in parallel, consider wrapping these updates in a tighter transaction and/or catching `IntegrityError` around `save()`/`create_user` and resolving it (e.g. by re-fetching or dropping the conflicting change).
</issue_to_address>Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.
There was a problem hiding this comment.
Pull request overview
This PR updates the schedule/session CSV import flow to upsert speaker users (create or update User + ensure SpeakerProfile) when linking speakers during submission import, improving deduplication between “speaker import” and “submission import” paths.
Changes:
- Added helpers to normalize/validate emails, split CSV values, zip speaker refs with speaker names, and upsert a session speaker (
_upsert_session_speaker). - Enhanced speaker import to match existing speakers by per-event
SpeakerProfilename and to backfill missingUser.emailwhen possible. - Updated submission import speaker-linking to use paired speaker ref/name inputs and upsert speaker users during linking.
Comments suppressed due to low confidence (1)
app/eventyay/base/services/talkimport.py:668
- In the speaker-linking loop,
SpeakerProfile.objects.get_or_create(user=speaker_user, event=event)is now redundant because_upsert_session_speaker()already doesSpeakerProfile.objects.get_or_create(user=user, event=event)before returning. Keeping both causes an extra DB query per linked speaker; remove the second call (or drop it from_upsert_session_speakerand keep it in the caller, but not both).
if speaker_user:
SpeakerRole.objects.get_or_create(submission=submission, user=speaker_user)
SpeakerProfile.objects.get_or_create(user=speaker_user, event=event)
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (1)
app/eventyay/base/services/talkimport.py:720
- _upsert_session_speaker() already ensures SpeakerProfile.objects.get_or_create(user=user, event=event) before returning. The additional SpeakerProfile.objects.get_or_create() here is redundant and adds an extra DB query per linked speaker. Consider removing this duplicate call (or make _upsert_session_speaker optionally skip profile creation if you want to keep that responsibility at the call site).
if speaker_user:
SpeakerRole.objects.get_or_create(submission=submission, user=speaker_user)
SpeakerProfile.objects.get_or_create(user=speaker_user, event=event)
- Process linked_speakers and speakers independently when list lengths differ to avoid pairing wrong name with wrong reference - Refresh user from DB after IntegrityError retry to prevent stale email/code in memory - Remove redundant SpeakerProfile.get_or_create (already done inside _upsert_session_speaker)
…ementError GenerateCode.save() has an internal IntegrityError retry loop that catches constraint violations and calls assign_code(). In PostgreSQL, this poisons the transaction and causes TransactionManagementError. Check uniqueness before setting email/code fields so the IntegrityError never fires. Also remove unused _refetch_speaker_after_conflict helper.
…consolidate query
…eventyay into upsert-speakers
Summary by Sourcery
Upsert and normalize speaker users during schedule import to better reuse existing profiles and avoid duplicate accounts.
New Features:
Enhancements: