Skip to content

Fix upsert speakers during schedule import#3433

Open
ArnavBallinCode wants to merge 11 commits intofossasia:devfrom
ArnavBallinCode:upsert-speakers
Open

Fix upsert speakers during schedule import#3433
ArnavBallinCode wants to merge 11 commits intofossasia:devfrom
ArnavBallinCode:upsert-speakers

Conversation

@ArnavBallinCode
Copy link
Copy Markdown
Member

@ArnavBallinCode ArnavBallinCode commented Apr 27, 2026

Summary by Sourcery

Upsert and normalize speaker users during schedule import to better reuse existing profiles and avoid duplicate accounts.

New Features:

  • Create or update speaker user accounts when linking speakers to submissions during schedule import, including generating new users when necessary.

Enhancements:

  • Normalize and validate email addresses and CSV speaker fields when importing talks.
  • Reuse and update existing SpeakerProfile records by matching on speaker references, names, or prior profiles, and cache lookups by both reference and name.
  • Populate missing user email and code fields when safe, while enforcing uniqueness constraints during import.

Copilot AI review requested due to automatic review settings April 27, 2026 05:23
@sourcery-ai
Copy link
Copy Markdown
Contributor

sourcery-ai Bot commented Apr 27, 2026

Reviewer's Guide

Enhances the schedule import to robustly upsert speakers by normalizing references, creating/updating users and speaker profiles, and improving how speakers are matched and linked to submissions and events.

Sequence diagram for upserting session speakers during submission import

sequenceDiagram
    participant ImportTask
    participant TalkImport as TalkImportModule
    participant ImportSubmission as _import_submission_row
    participant UpsertSpeaker as _upsert_session_speaker
    participant UserModel as User
    participant SpeakerProfileModel as SpeakerProfile
    participant SpeakerRoleModel as SpeakerRole

    ImportTask->>TalkImport: import_speakers(event, fileid, settings, locale, user_id)
    TalkImport->>ImportSubmission: _import_submission_row(event, settings, record, acting_user, speaker_cache)

    loop for each submission row
        ImportSubmission->>ImportSubmission: _zip_speaker_refs_and_names(linked_speakers, speakers_val)
        ImportSubmission->>ImportSubmission: build cache_key for each pair
        alt cache_miss
            ImportSubmission->>UpsertSpeaker: _upsert_session_speaker(event, speaker_ref, speaker_name)
            Note right of UpsertSpeaker: Match or create User and SpeakerProfile
            UpsertSpeaker->>UserModel: _find_user_for_speaker by ref or name
            alt user_found
                UpsertSpeaker->>UpsertSpeaker: _normalize_email_address(speaker_ref)
                UpsertSpeaker->>UpsertSpeaker: _normalize_speaker_identifier(speaker_ref)
                UpsertSpeaker->>UserModel: update fullname, email, code
                UserModel-->>UpsertSpeaker: user_saved
            else user_not_found
                UpsertSpeaker->>UpsertSpeaker: derive fallback_name
                UpsertSpeaker->>UserModel: create_user(password, email, fullname, code, pw_reset_token, pw_reset_time)
                UserModel-->>UpsertSpeaker: new_user
            end
            UpsertSpeaker->>SpeakerProfileModel: get_or_create(user, event)
            SpeakerProfileModel-->>UpsertSpeaker: speaker_profile
            UpsertSpeaker-->>ImportSubmission: user
            ImportSubmission->>ImportSubmission: cache[user_ref_and_name] = user
        else cache_hit
            ImportSubmission->>ImportSubmission: user = speaker_cache[cache_key]
        end
        alt user_exists
            ImportSubmission->>SpeakerRoleModel: get_or_create(submission, user)
            SpeakerRoleModel-->>ImportSubmission: speaker_role
        end
    end
Loading

ER diagram for user, speaker profile, and speaker role upsert relationships

erDiagram
    USER {
        int id
        string email
        string fullname
        string code
        string password
        string pw_reset_token
        datetime pw_reset_time
    }

    EVENT {
        int id
        string name
    }

    SUBMISSION {
        int id
        string title
        int event_id
    }

    SPEAKERPROFILE {
        int id
        int user_id
        int event_id
    }

    SPEAKERROLE {
        int id
        int submission_id
        int user_id
    }

    USER ||--o{ SPEAKERPROFILE : has
    EVENT ||--o{ SPEAKERPROFILE : has

    USER ||--o{ SPEAKERROLE : has
    SUBMISSION ||--o{ SPEAKERROLE : has

    EVENT ||--o{ SUBMISSION : has
Loading

Class diagram for updated talk import and speaker upsert logic

classDiagram
    class TalkImportModule {
        +_normalize_email_address(value: str) str
        +_split_csv_values(raw_value: str) list~str~
        +_zip_speaker_refs_and_names(linked_speakers: str, speakers_val: str) list~tuple~
        +_upsert_session_speaker(event, speaker_ref: str, speaker_name: str)
        +_import_speaker_row(event, settings: dict, record, acting_user)
        +_import_submission_row(event, settings: dict, record, acting_user, speaker_cache: dict)
    }

    class User {
        +id: int
        +email: str
        +fullname: str
        +code: str
        +password: str
        +pw_reset_token: str
        +pw_reset_time: datetime
        +create_user(password: str, email: str, fullname: str, code: str, pw_reset_token: str, pw_reset_time: datetime) User
        +save(update_fields: list~str~) void
        +objects_filter_email_iexaxt(email: str) QuerySet~User~
        +objects_filter_code_iexact(code: str) QuerySet~User~
    }

    class Event {
        +id: int
        +name: str
    }

    class Submission {
        +id: int
        +title: str
        +event: Event
    }

    class SpeakerProfile {
        +id: int
        +user: User
        +event: Event
        +objects_get_or_create(user: User, event: Event) SpeakerProfile
    }

    class SpeakerRole {
        +id: int
        +submission: Submission
        +user: User
        +objects_get_or_create(submission: Submission, user: User) SpeakerRole
    }

    TalkImportModule ..> User : creates_or_updates
    TalkImportModule ..> SpeakerProfile : creates
    TalkImportModule ..> SpeakerRole : creates
    TalkImportModule ..> Event
    TalkImportModule ..> Submission

    User "1" -- "many" SpeakerProfile : has
    Event "1" -- "many" SpeakerProfile : has
    User "1" -- "many" SpeakerRole : has
    Submission "1" -- "many" SpeakerRole : has
    Event "1" -- "many" Submission : has
Loading

File-Level Changes

Change Details Files
Introduce helper functions for normalizing speaker data and pairing speaker references with names for use during import.
  • Add email normalization with validation to safely derive canonical email addresses from raw input values.
  • Add CSV splitting helper that trims and filters empty values from comma-separated fields.
  • Add utility to zip linked_speakers and speakers values into aligned (ref, name) pairs, padding and discarding empty pairs as needed.
app/eventyay/base/services/talkimport.py
Implement _upsert_session_speaker to create or update users and speaker profiles based on schedule import data.
  • Derive candidate user by matching speaker reference or name using existing _find_user_for_speaker logic.
  • Normalize and classify the speaker reference as either an email or a generic identifier, enforcing maximum fullname length.
  • Update existing users’ fullname, email, and code fields selectively if new data is present and unique, saving only changed fields.
  • Create new users with generated password/reset values when no match exists, choosing a sensible fallback fullname from name, identifier, or email local-part.
  • Ensure a SpeakerProfile exists for each upserted user within the event via get_or_create.
app/eventyay/base/services/talkimport.py
Improve speaker import row handling so that existing speaker profiles can be matched by fullname and email safely updated.
  • When no user is found by email, attempt to resolve a SpeakerProfile for the event by case-insensitive fullname and use its user.
  • Extend user saving in _import_speaker_row to conditionally set email when missing and unique, while still updating fullname and optional fields.
  • Normalize email values consistently before uniqueness checks and updates.
app/eventyay/base/services/talkimport.py
Change submission speaker linking to use the new upsert logic and cache by both reference and name.
  • Replace flat list of speaker references with paired speaker_ref/speaker_name tuples from the new zipping helper.
  • Build speaker_cache keys from both lowercased reference and name, reducing collisions when the same ref has different names.
  • Use _upsert_session_speaker when populating the speaker_cache so that users and speaker profiles are created/updated as part of linking.
  • Keep existing behavior of creating SpeakerRole records for each resolved speaker on the submission.
app/eventyay/base/services/talkimport.py

Tips and commands

Interacting with Sourcery

  • Trigger a new review: Comment @sourcery-ai review on the pull request.
  • Continue discussions: Reply directly to Sourcery's review comments.
  • Generate a GitHub issue from a review comment: Ask Sourcery to create an
    issue from a review comment by replying to it. You can also reply to a
    review comment with @sourcery-ai issue to create an issue from it.
  • Generate a pull request title: Write @sourcery-ai anywhere in the pull
    request title to generate a title at any time. You can also comment
    @sourcery-ai title on the pull request to (re-)generate the title at any time.
  • Generate a pull request summary: Write @sourcery-ai summary anywhere in
    the pull request body to generate a PR summary at any time exactly where you
    want it. You can also comment @sourcery-ai summary on the pull request to
    (re-)generate the summary at any time.
  • Generate reviewer's guide: Comment @sourcery-ai guide on the pull
    request to (re-)generate the reviewer's guide at any time.
  • Resolve all Sourcery comments: Comment @sourcery-ai resolve on the
    pull request to resolve all Sourcery comments. Useful if you've already
    addressed all the comments and don't want to see them anymore.
  • Dismiss all Sourcery reviews: Comment @sourcery-ai dismiss on the pull
    request to dismiss all existing Sourcery reviews. Especially useful if you
    want to start fresh with a new review - don't forget to comment
    @sourcery-ai review to trigger a new review!

Customizing Your Experience

Access your dashboard to:

  • Enable or disable review features such as the Sourcery-generated pull request
    summary, the reviewer's guide, and others.
  • Change the review language.
  • Add, remove or edit custom review instructions.
  • Adjust other review settings.

Getting Help

Copy link
Copy Markdown
Contributor

@sourcery-ai sourcery-ai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey - I've found 1 issue, and left some high level feedback:

  • In _import_speaker_row, normalized_email = email.lower().strip() will throw if email is None; consider reusing _normalize_email_address or guarding for None/empty values before calling string methods.
  • The new email normalization/uniqueness checks are implemented slightly differently in _normalize_email_address and _import_speaker_row; consider consolidating to a single helper to ensure consistent validation and normalization of email fields across import paths.
Prompt for AI Agents
Please address the comments from this code review:

## Overall Comments
- In `_import_speaker_row`, `normalized_email = email.lower().strip()` will throw if `email` is `None`; consider reusing `_normalize_email_address` or guarding for `None`/empty values before calling string methods.
- The new email normalization/uniqueness checks are implemented slightly differently in `_normalize_email_address` and `_import_speaker_row`; consider consolidating to a single helper to ensure consistent validation and normalization of email fields across import paths.

## Individual Comments

### Comment 1
<location path="app/eventyay/base/services/talkimport.py" line_range="174-183" />
<code_context>
+    if normalized_ref and not normalized_email:
+        normalized_identifier = _normalize_speaker_identifier(normalized_ref)
+
+    if user:
+        update_fields = []
+        if normalized_name and user.fullname != normalized_name:
+            user.fullname = normalized_name
+            update_fields.append('fullname')
+        if normalized_email and not user.email:
+            if not User.objects.filter(email__iexact=normalized_email).exclude(pk=user.pk).exists():
+                user.email = normalized_email
+                update_fields.append('email')
+        if (
+            normalized_identifier
+            and not user.code
+            and not User.objects.filter(code__iexact=normalized_identifier).exclude(pk=user.pk).exists()
+        ):
+            user.code = normalized_identifier
+            update_fields.append('code')
+        if update_fields:
+            user.save(update_fields=update_fields)
+    else:
+        fallback_name = normalized_name
</code_context>
<issue_to_address>
**issue (bug_risk):** Concurrent imports could race on email/code uniqueness checks in `_upsert_session_speaker`.

This `exists()`-then-assign pattern for `email`/`code` is subject to a race if two workers import the same speaker concurrently: both can pass the check and then one will hit an `IntegrityError` on save due to the DB unique constraint. If this function can run in parallel, consider wrapping these updates in a tighter transaction and/or catching `IntegrityError` around `save()`/`create_user` and resolving it (e.g. by re-fetching or dropping the conflicting change).
</issue_to_address>

Sourcery is free for open source - if you like our reviews please consider sharing them ✨
Help me be more useful! Please click 👍 or 👎 on each comment and I'll use the feedback to improve your reviews.

Comment thread app/eventyay/base/services/talkimport.py
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the schedule/session CSV import flow to upsert speaker users (create or update User + ensure SpeakerProfile) when linking speakers during submission import, improving deduplication between “speaker import” and “submission import” paths.

Changes:

  • Added helpers to normalize/validate emails, split CSV values, zip speaker refs with speaker names, and upsert a session speaker (_upsert_session_speaker).
  • Enhanced speaker import to match existing speakers by per-event SpeakerProfile name and to backfill missing User.email when possible.
  • Updated submission import speaker-linking to use paired speaker ref/name inputs and upsert speaker users during linking.
Comments suppressed due to low confidence (1)

app/eventyay/base/services/talkimport.py:668

  • In the speaker-linking loop, SpeakerProfile.objects.get_or_create(user=speaker_user, event=event) is now redundant because _upsert_session_speaker() already does SpeakerProfile.objects.get_or_create(user=user, event=event) before returning. Keeping both causes an extra DB query per linked speaker; remove the second call (or drop it from _upsert_session_speaker and keep it in the caller, but not both).
                if speaker_user:
                    SpeakerRole.objects.get_or_create(submission=submission, user=speaker_user)
                    SpeakerProfile.objects.get_or_create(user=speaker_user, event=event)

Comment thread app/eventyay/base/services/talkimport.py
Comment thread app/eventyay/base/services/talkimport.py Outdated
Copilot AI review requested due to automatic review settings April 29, 2026 08:58
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

@ArnavBallinCode ArnavBallinCode changed the title Upsert speakers during schedule import Fix upsert speakers during schedule import Apr 29, 2026
@Saksham-Sirohi Saksham-Sirohi requested a review from Copilot April 29, 2026 09:59
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (1)

app/eventyay/base/services/talkimport.py:720

  • _upsert_session_speaker() already ensures SpeakerProfile.objects.get_or_create(user=user, event=event) before returning. The additional SpeakerProfile.objects.get_or_create() here is redundant and adds an extra DB query per linked speaker. Consider removing this duplicate call (or make _upsert_session_speaker optionally skip profile creation if you want to keep that responsibility at the call site).
                if speaker_user:
                    SpeakerRole.objects.get_or_create(submission=submission, user=speaker_user)
                    SpeakerProfile.objects.get_or_create(user=speaker_user, event=event)

Comment thread app/eventyay/base/services/talkimport.py Outdated
Comment thread app/eventyay/base/services/talkimport.py Outdated
- Process linked_speakers and speakers independently when list lengths
  differ to avoid pairing wrong name with wrong reference
- Refresh user from DB after IntegrityError retry to prevent stale
  email/code in memory
- Remove redundant SpeakerProfile.get_or_create (already done inside
  _upsert_session_speaker)
…ementError

GenerateCode.save() has an internal IntegrityError retry loop that
catches constraint violations and calls assign_code(). In PostgreSQL,
this poisons the transaction and causes TransactionManagementError.

Check uniqueness before setting email/code fields so the
IntegrityError never fires. Also remove unused
_refetch_speaker_after_conflict helper.
Copilot AI review requested due to automatic review settings April 29, 2026 18:46
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 4 comments.

Comment thread app/eventyay/base/services/talkimport.py Outdated
Comment thread app/eventyay/base/services/talkimport.py
Comment thread app/eventyay/base/services/talkimport.py Outdated
Comment thread app/eventyay/base/services/talkimport.py Outdated
Copilot AI review requested due to automatic review settings April 30, 2026 12:41
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 3 comments.

Comment thread app/eventyay/base/services/talkimport.py
Comment thread app/eventyay/base/services/talkimport.py
Comment thread app/eventyay/base/services/talkimport.py
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: Backlog

Development

Successfully merging this pull request may close these issues.

2 participants