User:PrimeBOT/Task 17


    Status and updates for Task 17

    List of params

    Regex updates

    because these things are boring

    Original

    • \??(?:&?utm_[^=]*?=[^&\s\]\|]*)+(?=]|\s|\|)|(?<=\?)(?:&?utm_[^=]*?=[^&\s\]\|]*)+&
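A minimal sketch of what the original pattern removes, assuming Python's `re` module (the page does not say which regex engine the bot runs on). The `strip_utm` helper and the example.com URLs are hypothetical and only for illustration.

```python
import re

# Original pattern from the page, copied verbatim and split at the "|"
# between its two alternatives.
ORIGINAL = re.compile(
    r"\??(?:&?utm_[^=]*?=[^&\s\]\|]*)+(?=]|\s|\|)"
    r"|(?<=\?)(?:&?utm_[^=]*?=[^&\s\]\|]*)+&"
)


def strip_utm(text: str) -> str:
    """Delete every match of the original pattern (hypothetical helper)."""
    return ORIGINAL.sub("", text)


# utm_ params make up the whole query string: the "?" is removed with them.
trailing = strip_utm("[https://example.com/a?utm_source=feed&utm_medium=rss Title]")

# utm_ params lead the query string: the "?" stays and the joining "&" goes.
leading = strip_utm("[https://example.com/a?utm_source=feed&id=7 Title]")

print(trailing)
print(leading)
```

The first alternative handles utm_ params that run to the end of the URL; the second handles utm_ params that directly follow the `?` and are followed by another parameter.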

    27 May (BRFA trial) - add a third alternative to catch utm_ params in the middle of the query string, and add } to the lookahead to catch more end-of-URL possibilities

    • \??(?:&?utm_[^=\s]*?=[^&\s\]\|]*?)+(?=}|]|\s|\|)|(?<=\?)(?:&?utm_[^=\s]*?=[^&\s\]\|]*)+&|(?<=&)(?:&?utm_[^=\s]*?=[^&\s\]\|]*)+&
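The effect of the 27 May change can be sketched by running both versions on the same inputs. This again assumes Python's `re` module, and the sample wikitext is made up for illustration.

```python
import re

# Original pattern and the 27 May revision, both copied from the page.
ORIGINAL = re.compile(
    r"\??(?:&?utm_[^=]*?=[^&\s\]\|]*)+(?=]|\s|\|)"
    r"|(?<=\?)(?:&?utm_[^=]*?=[^&\s\]\|]*)+&"
)
MAY_27 = re.compile(
    r"\??(?:&?utm_[^=\s]*?=[^&\s\]\|]*?)+(?=}|]|\s|\|)"
    r"|(?<=\?)(?:&?utm_[^=\s]*?=[^&\s\]\|]*)+&"
    r"|(?<=&)(?:&?utm_[^=\s]*?=[^&\s\]\|]*)+&"
)

# A utm_ param sandwiched between two ordinary params.
middle = "[https://example.com/a?id=7&utm_source=feed&page=2 Title]"
# A URL that ends at the closing braces of a template.
template = "{{cite web|url=https://example.com/a?utm_source=feed}}"

middle_old = ORIGINAL.sub("", middle)      # original pattern finds nothing
middle_new = MAY_27.sub("", middle)        # third alternative removes the param
template_old = ORIGINAL.sub("", template)  # "}" is not an accepted URL end yet
template_new = MAY_27.sub("", template)    # "}" in the lookahead ends the match

print(middle_new)
print(template_new)
```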

    7 June (catch ref tags) - add < to the end-of-URL lookahead so that a closing ref tag ends the match

    • \??(?:&?utm_[^=\s]*?=[^&\s\]\|]*?)+(?=<|}|]|\s|\|)|(?<=\?)(?:&?utm_[^=\s]*?=[^&\s\]\|]*)+&|(?<=&)(?:&?utm_[^=\s]*?=[^&\s\]\|]*)+&
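Why `<` matters can be seen on a bare URL inside ref tags. The sketch below assumes Python's `re` module and an invented ref; with the 27 May pattern the value keeps growing through `</ref>` until it reaches whitespace, so the closing tag is swallowed.

```python
import re

# 27 May and 7 June revisions, copied from the page.
MAY_27 = re.compile(
    r"\??(?:&?utm_[^=\s]*?=[^&\s\]\|]*?)+(?=}|]|\s|\|)"
    r"|(?<=\?)(?:&?utm_[^=\s]*?=[^&\s\]\|]*)+&"
    r"|(?<=&)(?:&?utm_[^=\s]*?=[^&\s\]\|]*)+&"
)
JUNE_7 = re.compile(
    r"\??(?:&?utm_[^=\s]*?=[^&\s\]\|]*?)+(?=<|}|]|\s|\|)"
    r"|(?<=\?)(?:&?utm_[^=\s]*?=[^&\s\]\|]*)+&"
    r"|(?<=&)(?:&?utm_[^=\s]*?=[^&\s\]\|]*)+&"
)

ref = "<ref>https://example.com/a?utm_source=feed</ref> More text."

ref_may = MAY_27.sub("", ref)   # the match runs through "</ref>" to the space
ref_june = JUNE_7.sub("", ref)  # the match stops in front of "<"

print(ref_may)
print(ref_june)
```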

    8 June (catch malformed utm_ params) - utm_ must be followed by text and an =

    • \??(?:&?utm_[^=\s\|<}\]]*?=[^&\s\]\|]*?)+(?=<|}|]|\s|\|)|(?<=\?)(?:&?utm_[^=\s\|<}\]]*?=[^&\s\]\|]*)+&|(?<=&)(?:&?utm_[^=\s\|<}\]]*?=[^&\s\]\|]*)+&
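The 8 June change narrows what may sit between `utm_` and `=`. As a sketch (Python's `re` assumed, citation invented), a utm_ fragment with no `=` of its own can otherwise borrow the `=` of the next template parameter:

```python
import re

# 7 June and 8 June revisions, copied from the page.
JUNE_7 = re.compile(
    r"\??(?:&?utm_[^=\s]*?=[^&\s\]\|]*?)+(?=<|}|]|\s|\|)"
    r"|(?<=\?)(?:&?utm_[^=\s]*?=[^&\s\]\|]*)+&"
    r"|(?<=&)(?:&?utm_[^=\s]*?=[^&\s\]\|]*)+&"
)
JUNE_8 = re.compile(
    r"\??(?:&?utm_[^=\s\|<}\]]*?=[^&\s\]\|]*?)+(?=<|}|]|\s|\|)"
    r"|(?<=\?)(?:&?utm_[^=\s\|<}\]]*?=[^&\s\]\|]*)+&"
    r"|(?<=&)(?:&?utm_[^=\s\|<}\]]*?=[^&\s\]\|]*)+&"
)

# Malformed param: "utm_campaign" has no "=value" before the next "|".
cite = "{{cite web|url=https://example.com/a?utm_campaign|title=News}}"

cite_june_7 = JUNE_7.sub("", cite)  # name spans "campaign|title", title is lost
cite_june_8 = JUNE_8.sub("", cite)  # name cannot cross "|", nothing is removed

print(cite_june_7)
print(cite_june_8)
```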

    10 June (avoid web archive links)

    • (?<!https://web.archive.org[\S]+)(\??(?:&?utm_[^=\s\|<}\]]*?=[^&\s\]\|]*?)+(?=<|}|]|\s|\|)|(?<=\?)(?:&?utm_[^=\s\|<}\]]*?=[^&\s\]\|]*)+&|(?<=&)(?:&?utm_[^=\s\|<}\]]*?=[^&\s\]\|]*)+&)
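The 10 June pattern relies on a variable-length lookbehind, which Python's built-in `re` module rejects ("look-behind requires fixed-width pattern"), so it cannot be run there as written. The sketch below is an approximation rather than the bot's method: it applies the 8 June pattern and uses a replacement callback to skip matches that directly follow a web.archive.org URL. The helper names and sample links are hypothetical.

```python
import re

# 8 June revision, copied from the page (the part inside the 10 June group).
JUNE_8 = re.compile(
    r"\??(?:&?utm_[^=\s\|<}\]]*?=[^&\s\]\|]*?)+(?=<|}|]|\s|\|)"
    r"|(?<=\?)(?:&?utm_[^=\s\|<}\]]*?=[^&\s\]\|]*)+&"
    r"|(?<=&)(?:&?utm_[^=\s\|<}\]]*?=[^&\s\]\|]*)+&"
)
# Body of the page's lookbehind, anchored to the end of the text that
# precedes a candidate match.
ARCHIVE_PREFIX = re.compile(r"https://web.archive.org[\S]+\Z")


def strip_utm_outside_archives(text: str) -> str:
    """Remove utm_ params unless an archive URL runs up to the match."""

    def replace(match: re.Match) -> str:
        before = text[: match.start()]
        if ARCHIVE_PREFIX.search(before):
            return match.group(0)  # inside an archive link: leave untouched
        return ""

    return JUNE_8.sub(replace, text)


links = (
    "[https://web.archive.org/web/2017/https://example.com/a?utm_source=feed Archived] "
    "[https://example.com/a?utm_source=feed Live]"
)
cleaned = strip_utm_outside_archives(links)
print(cleaned)
```

Archive links are left alone because the archived copy is addressed by the full original URL, so editing its query string could point at a different snapshot or none at all.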

    1 July (avoid utm_ strings just hanging out in text) - skip matches that directly follow | or whitespace

    • (?<!https://web.archive.org[\S]+|\||\s)(\??(?:&?utm_[^=\s\|<}\]]*?=[^&\s\]\|]*?)+(?=<|}|]|\s|\|)|(?<=\?)(?:&?utm_[^=\s\|<}\]]*?=[^&\s\]\|]*)+&|(?<=&)(?:&?utm_[^=\s\|<}\]]*?=[^&\s\]\|]*)+&)
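The `|` and whitespace part of the 1 July lookbehind is fixed-width, so it can be tried on its own in Python's `re`. This sketch deliberately leaves out the web.archive.org branch (Python's `re` cannot compile it) and uses invented sample text.

```python
import re

# 8 June revision as a plain string so it can be wrapped in a group.
JUNE_8_BODY = (
    r"\??(?:&?utm_[^=\s\|<}\]]*?=[^&\s\]\|]*?)+(?=<|}|]|\s|\|)"
    r"|(?<=\?)(?:&?utm_[^=\s\|<}\]]*?=[^&\s\]\|]*)+&"
    r"|(?<=&)(?:&?utm_[^=\s\|<}\]]*?=[^&\s\]\|]*)+&"
)
UNGUARDED = re.compile(JUNE_8_BODY)
# Only the "|" and whitespace exclusions from the 1 July lookbehind.
GUARDED = re.compile(r"(?<![|\s])(" + JUNE_8_BODY + r")")

prose = "Remove utm_source=feed from the link."
link = "[https://example.com/a?utm_source=feed Live]"

prose_unguarded = UNGUARDED.sub("", prose)  # deletes the words from the sentence
prose_guarded = GUARDED.sub("", prose)      # preceded by a space: left alone
link_guarded = GUARDED.sub("", link)        # real URLs are still cleaned

print(prose_guarded)
print(link_guarded)
```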