Meeting Minutes for 08/04/2020

Discovery Committee Special Meeting
August 4, 2020
Minutes

 

All test servers are now using the new grouping logic and mechanisms

  • July was spent migrating all the regular test servers to the new grouping algorithm and building a process to convert the IDs in user data to their IDs in the new version.
  • August is the final month for a review of grouping improvements before all the changes are made live in production with the September 1st deployment. 
  • Test servers preview how the grouping improvements will work and display in production. Review your test sites to evaluate the new grouping and uncover any issues that need attention before the changes go into production.
  • Once the grouping is in production, it is hard to fix without changing the group work ID, which in turn affects user data.
  • The grouping title size was increased to 400 characters because some works, especially government documents, have long titles that should not group together. Under the previous, shorter limit, the end of a long title was cut off, so distinct titles that shared the same opening text were grouping together.
  • The author-name size was increased to 100 characters.
  • Steps of the Grouping Version Migration
    • Group every single record in the catalog from scratch because each record will have a new grouping ID. This would also include any sideloads.
    • The Discovery Partners process will happen overnight and be done during a regular code deployment.  
    • For Marmot, the code deployment will be done over the three-day Labor Day weekend. The migration process will take a long time to complete, so running it then will be less disruptive.
    • All the user-generated data is based on group work IDs, so we will need to convert them to group work IDs in the newer system.
    • A version map was created with all the group work IDs to match as many entries as possible.  Unfortunately, it is not possible to match everything.  Group work IDs will exist in user data for things that no longer have a record in the catalog. Without an existing record, it is really hard to map what a group work would be in the new algorithm. 
    • A new manual merging system was created for titles and authors. It replaces every instance of the Source Grouping Title or Author with the designated Grouping Title or Author across all grouped works, not just for a single grouped work of interest.
    • Before the version migration, the test servers were updated with user data from production.  If a person had lists or ratings in their personal account at the time of the update, that information will show up in the test server.  
    • At the end of the user data migration process, anything that does not match in the database will be deleted because that ID will be meaningless in the new algorithm.
    • Reading history is different because the data will still display even if it does not connect to a current work. 
  • Grouped work IDs in test sites are different from those in production sites.
    • Searching by a grouped work ID from production will not yield results in test due to these changes.
  • Testing suggestions
    • Review your sideloads in your test site for potential grouping issues.
    • As cataloging work is happening, you can compare production against test sites and review the differences in grouping logic.
    • If you have lists and reading history associated with accounts in production, log into your test site and check whether they match what you expect from production. If they match, the mapping is working correctly. If the mapping is not working, please let the Pika team know.
  • Pascal included additional details about the record grouping improvements in this documentation.
  • Please reference the documentation as you’re reviewing in your test sites.
  • Please collect any issues you experience with grouping improvements in your library’s/libraries’ test sites in an email and send to pika@marmot.org 
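
The effect of the longer grouping title limit noted above can be illustrated with a small sketch. This is not Pika's actual grouping code: the function name `grouping_title`, the normalization steps, and the 50-character "old" limit are all assumptions chosen for illustration (the minutes state only the new 400-character limit).

```python
def grouping_title(title: str, max_len: int) -> str:
    """Hypothetical grouping key: lowercase, collapse whitespace, truncate."""
    normalized = " ".join(title.lower().split())
    return normalized[:max_len]


# Two distinct government documents that share a long opening phrase.
prefix = (
    "Hearing before the Subcommittee on Water and Power of the "
    "Committee on Energy and Natural Resources, "
)
title_a = prefix + "first session"
title_b = prefix + "second session"

OLD_LIMIT = 50   # assumed value for illustration only; not stated in the minutes
NEW_LIMIT = 400  # the new limit reported in the minutes

# With the short limit both keys are cut off inside the shared prefix.
same_under_old = grouping_title(title_a, OLD_LIMIT) == grouping_title(title_b, OLD_LIMIT)
# With the longer limit the differing endings are kept.
same_under_new = grouping_title(title_a, NEW_LIMIT) == grouping_title(title_b, NEW_LIMIT)

print("group together under old limit:", same_under_old)
print("group together under new limit:", same_under_new)
```

When reviewing a test site, long-titled records that grouped together in production but appear separately in test are likely showing this change.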

Q & A Section (Minute 8:09)

Q: What happens with works that have been manually grouped up until now?
A: Hopefully, the grouping improvements will fix the issues that caused the need for the items to be manually grouped together. The manual groups will not apply because they will have IDs from the previous version, so they will no longer match. In fact, the manual merges will be wiped out because they do not represent valid merges.
Q: When the test server is made live, will the group work IDs change?
A: The IDs that exist in the test servers now will be the IDs they will be in production.
Q: If a library has websites that are pointing to group work IDs those will all change?
A: Yes. 
Q: Will you do URL referrals for those libraries that are pointing to group work IDs, or does a library need to update their websites?
A: It is recommended that libraries update any bookmarked links that use group work IDs.
Q: If you use the URL for the group work ID from the test server, that should work when we go to production?
A: Yes, that is correct.
Q: When will the test server go into production?
A: September 1st for the regular deployment.  However, Marmot libraries will need to wait for the Labor Day weekend (September 6 & 7).
Q: Bud Werner put a great deal of time into manually merging travel guides, and wondered if all that effort will be lost?
A: Pascal could try to work on mapping, but since all the IDs are different the entries in the merging table will have IDs that do not exist anymore. Anything that is not available for mapping will be lost.
Q: Is the test site opac3.marmot.org?
A: No, it is your regular test site.
Q: Can you speak about the way the changes to the grouping logic will affect manual unmerges as well?
A: Ungrouping is not an issue for those entries in place since they use bibliographic or record IDs. The ungrouping has been turned off on the test servers so the Pika team can look at it.
Q: Are we still using the record grouping server?
A: The record grouping server is not being used anymore. All the record grouping algorithms are now in your library’s test site.
Q: So I'm noticing that the binge box art that we have in production is not in test.  We added that art ourselves.  Will we see the art when we migrate?
A: Custom covers are tied to bibliographic records and will be in production.
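
The ID conversion described in the migration steps, and raised again in several of the answers above, can be pictured as a lookup through the version map. The sketch below is only an illustration under stated assumptions: the function `migrate_user_lists`, the dictionary layout, and the ID strings are hypothetical and do not reflect Pika's actual schema or code.

```python
from typing import Dict, List, Tuple


def migrate_user_lists(
    user_lists: Dict[str, List[str]],
    version_map: Dict[str, str],
) -> Tuple[Dict[str, List[str]], List[str]]:
    """Convert old group work IDs to new ones; collect IDs with no match.

    Hypothetical helper: entries whose old ID is missing from the version
    map are dropped, mirroring the minutes' note that unmatched user data
    is deleted at the end of the migration.
    """
    migrated: Dict[str, List[str]] = {}
    dropped: List[str] = []
    for list_name, old_ids in user_lists.items():
        new_ids: List[str] = []
        for old_id in old_ids:
            new_id = version_map.get(old_id)
            if new_id is None:
                dropped.append(old_id)  # no record to map to in the new version
            else:
                new_ids.append(new_id)
        migrated[list_name] = new_ids
    return migrated, dropped


# Made-up IDs for illustration only.
version_map = {"old-a": "new-a", "old-b": "new-b"}
user_lists = {"Favorites": ["old-a", "old-x", "old-b"]}

migrated, dropped = migrate_user_lists(user_lists, version_map)
print(migrated)
print(dropped)
```

Per the minutes, reading history is handled differently from the data in this sketch: those entries still display even when they no longer connect to a current work.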

Next Meeting is Tuesday, September 1, 2020  

 
Meeting Date: Tuesday, August 4, 2020
Documentation Type: Meeting Minutes
Committees: Discovery Committee