Explore interplay between Blueprints and Assembler #117

Open
brandonpayton opened this issue Nov 7, 2024 · 16 comments

Comments

@brandonpayton
Member

This is an issue to publicly capture the exploration of how Blueprints and Assembler can align and interact. Blueprints are files used to create WordPress sites with specific settings, themes, plugins, content, and anything else that can be configured in WordPress. Assembler is a way for users to declare pages and settings for their site by selecting patterns, fonts, and styles.

Note: The Assembler discussed here is incidentally an Automattic-internal prototype for a simple way to create an outline of site structure and styles, but we can still conceptually consider how something like Assembler can inform the design of Blueprints v2.

We began this discussion by considering how Assembler could speak in terms of a Blueprint. I imagined using Assembler as a utility to create a Blueprint or input for a Blueprint. One can imagine that conceptually, but in practice, it turned out to be a bit more difficult.

The Assembler is implemented using block editor primitives and saves changes to WordPress as it progresses through its flow. Because Assembler makes changes to WordPress, we can collect those changes at the end of its flow and incorporate those in a Blueprint, but when we asked whether Assembler could reasonably be made to function without saving changes to WordPress, the answer was No due to the way block editor primitives work. In addition, some folks see Blueprints as a way to create different initial sites before Assembler further configures them.

We started by thinking Assembler would create steps for Blueprints. Others imagined Blueprints simply as a way to preset a site before it is modified by Assembler. Both are practical possibilities.

What should we focus on to get the most out of this exploration?

  • If Blueprints are just a way to prep a site for Assembler, then AFAICT, there is not very much to learn from their interaction because Assembler simply continues a site where a Blueprint left off. There is no direct interaction.
  • If we use Assembler to create content for a Blueprint, there is more of a relationship between the two, and we can consider how the design of Blueprints v2 might incorporate that kind of input.

What are good next steps? What can we prototype?

@adamziel @bgrgicak @mcsf @youknowriad @akirk

@brandonpayton
Member Author

What are good next steps? What can we prototype?

I would love to hear others' thoughts on this.

For my part, I think it probably makes sense to see how we can extract a narrowly-scoped site export at the end of an Assembler flow.

@brandonpayton
Member Author

As for using Assembler without persisting changes to WordPress, I wonder whether we could use something like Sandbox Site powered by Playground, though sandboxing with an entire site's data may be heavy-handed. (I'm not familiar with the sandbox plugin's implementation and do not know whether this concern is based on reality or not.)

@bgrgicak
Collaborator

bgrgicak commented Nov 7, 2024

We started by thinking Assembler would create steps for Blueprints. Others imagined Blueprints simply as a way to preset a site before it is modified by Assembler. Both are practical possibilities.

What about having both? We discussed allowing users to run Blueprints multiple times and it's technically possible.

I wonder whether we could use something like Sandbox Site powered by Playground, though sandboxing with an entire site's data may be heavy-handed.

The goal of that plugin is to extract a live site into Playground, using SQL and file exports.
There is no way to move the changes you made in a Sandbox back to a live site.

The current implementation is experimental and broken, but even when it works it's not scalable to large sites.
When we finish our Data Liberation work, it should be possible to Sandbox large sites.

@brandonpayton
Member Author

I wonder whether we could use something like Sandbox Site powered by Playground, though sandboxing with an entire site's data may be heavy-handed.

The goal of that plugin is to extract a live site into Playground, using SQL and file exports. There is no way to move the changes you made in a Sandbox back to a live site.

@bgrgicak Thanks for your feedback on this. My main thought here was the possibility of using an approach like the sandbox plugin to have Assembler work on a "virtual WP" instead of the live site. Then it might be used as a utility that produces output rather than something that always makes changes to the current site.

This is super hand-wavy, but that's the gist.

@bgrgicak
Collaborator

bgrgicak commented Nov 8, 2024

My main thought here was the possibility of using an approach like the sandbox plugin to have Assembler work on a "virtual WP" instead of the live site.

That sounds like a great idea. If this is a new site with a small amount of content, we could even use the Sandbox plugin for it.

@mcsf

mcsf commented Nov 8, 2024

What should we focus on to get the most out of this exploration?

  • If Blueprints are just a way to prep a site for Assembler, then AFAICT, there is not very much to learn from their interaction because Assembler simply continues a site where a Blueprint left off. There is no direct interaction.
  • If we use Assembler to create content for a Blueprint, there is more of a relationship between the two, and we can consider how the design of Blueprints v2 might incorporate that kind of input.

With the understanding that I'm commenting as an outsider to this project: like Bero, I think the two ultimately belong together. As an analogy, the Create Block Theme plugin leverages the Site Editor to create new themes, while, in turn, themes are fed into the Site Editor machine. But if the question is which task to prioritise, I would say that, as far as I understood, the mandate was "Assembler → Blueprint" first.

@brandonpayton
Member Author

But if the question is which task to prioritise, I would say that, as far as I understood, the mandate was "Assembler → Blueprint" first.

@bgrgicak and @mcsf Thank you for your feedback here. This was my original understanding as well.

It's been a bit quiet here because I've been AFK, but I'm scheduled to be back on Thursday and plan to check in again then.

Offhand, I think some good initial steps are:

  1. Demonstrate an ability to collect the outcome of the Assembler flow.
  2. Make sure that outcome can be represented and duplicated using a Blueprint
  3. Bonus: If we want to be able to use the Assembler without ultimately changing a site, explore ways to sandbox it. Perhaps one of the following would be good:
    • Use an approach like the Sandbox plugin to capture Assembler output without mutating underlying WP
    • See if there are adequate WP hooks to capture Assembler changes onto some kind of DB overlay.

@adamziel
Collaborator

adamziel commented Nov 13, 2024

Looping in @mtias about Assembler -> Blueprint vs Blueprint -> Assembler.

Demonstrate an ability to collect the outcome of the Assembler flow.

One idea here: https://github.com/Emilia-Capital/blueprint-builder

URL rewriting, assets export/import, and the rest of the data liberation machinery seem highly relevant here.

Make sure that outcome can be represented and duplicated using a Blueprint

Highly relevant issue that we may need to solve as a part of this exploration:

I just updated it with a few relevant writeups.

Use an approach like the Sandbox plugin to capture Assembler output without mutating underlying WP

This will only work for small sites today. If that's all it's needed for, that's perfect. For larger sites, we'll need to support partial sync to avoid downloading gigabytes of data just to add a contact page.

See if there are adequate WP hooks to capture Assembler changes onto some kind of DB overlay.

This sounds interesting. I'm not convinced about DB overlay as, at that level, we're starting to reason about raw SQL queries. Security-wise that seems fine, as templates are PHP files anyway so we're already in the code execution world. But I worry about IDs and data conflicts. However, it would be lovely if this general direction of explorations would yield a declarative data bundle that can be imported in a conflict-less way.

@brandonpayton
Member Author

This sounds interesting. I'm not convinced about DB overlay as, at that level, we're starting to reason about raw SQL queries.

I have a crazy See-If-It-Can-Be-Done urge related to this and the thorough MySQL parser we're working on. 😄 But really, this is the kind of thing that is better implemented as a DBMS feature. AFAICT, MySQL doesn't have a first-class feature like this, but it reminds me of the Playground Explore Dolt issue and the kinds of things Dolt could be used to do.

@brandonpayton
Member Author

I think the discussions of comprehensive export formats and the idea of snapshots are important. But since I'm not so familiar with WP exports today, I'm just going to see how far I can get with a naive approach like: blueprint-as-from-blueprint-builder + WXR import + uploads zip. It would be good to get something working as a prototype and hopefully demonstrate how an approach like this is insufficient.

@brandonpayton
Member Author

This sounds interesting. I'm not convinced about DB overlay as, at that level, we're starting to reason about raw SQL queries.

I have a crazy See-If-It-Can-Be-Done urge related to this and the thorough MySQL parser we're working on. 😄 But really, this is the kind of thing that is better implemented as a DBMS feature. AFAICT, MySQL doesn't have a first-class feature like this, but it reminds me of the Playground Explore Dolt issue and the kinds of things Dolt could be used to do.

You can also parse the MySQL bin log if it is enabled and readable, but we cannot count on this for WP because it requires editing MySQL config.

@brandonpayton
Member Author

brandonpayton commented Nov 16, 2024

I'm just going to see how far I can get with a naive approach like: blueprint-as-from-blueprint-builder + WXR import + uploads zip. It would be good to get something working as a prototype and hopefully demonstrate how an approach like this is insufficient.

Actually... Will check out the Sandbox plugin approach first, which should be a great place to start for new sites, which is what Assembler targets at the moment.

@brandonpayton
Member Author

After more thought, I'm exploring using a zip containing a blueprint at a well-known location and allowing that Blueprint to reference assets within the zip. That should be extremely flexible and useful for Blueprints outside of the export use case.

For a basic WP export, such a package would:

  • Include wp-content/uploads
  • Include a full export to WXR
  • Copy the uploads into place
  • Import the WXR while skipping attachment downloads
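
A minimal sketch of what assembling such a package could look like, using only Python's standard `zipfile` module. The well-known location (`blueprint.json` at the zip root), the `export.wxr` file name, and the step names and fields inside the manifest are all assumptions made for illustration; they are not the actual Blueprint schema.

```python
import io
import json
import zipfile

BLUEPRINT_PATH = "blueprint.json"   # assumed well-known location in the zip
WXR_PATH = "export.wxr"             # assumed name for the bundled WXR export
UPLOADS_PREFIX = "wp-content/uploads/"


def build_package(wxr_xml: str, uploads: dict[str, bytes]) -> bytes:
    """Return the bytes of a zip holding a manifest, a WXR file, and uploads.

    `uploads` maps paths relative to wp-content/uploads to file contents.
    """
    # Hypothetical manifest: the steps refer to assets by their path
    # inside the same zip rather than by URL.
    blueprint = {
        "steps": [
            {"step": "copyUploadsIntoPlace", "from": UPLOADS_PREFIX},
            {"step": "importWxrWithoutFetching", "file": WXR_PATH},
        ]
    }
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as package:
        package.writestr(BLUEPRINT_PATH, json.dumps(blueprint, indent=2))
        package.writestr(WXR_PATH, wxr_xml)
        for relative_path, data in uploads.items():
            package.writestr(UPLOADS_PREFIX + relative_path, data)
    return buffer.getvalue()
```

A consumer would open the zip, read the manifest from the well-known path first, and resolve every asset reference against the archive's own contents.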

There is unfortunately a catch with the current WordPress importer: attachments are only created during WXR processing if fetching attachments is enabled. We can probably adjust this behavior, but if pressed for a short-term workaround, we could:

  • Copy the zipped wp-content/uploads to a non-standard location
  • Preprocess the WXR attachment URLs to refer to the non-standard location within the local Playground instance
  • Rely on the import process to fetch the media from Playground and create the attachment posts
  • Cleanup: Delete the original media from the non-standard location
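
The preprocessing step in that workaround could be sketched roughly as below. WXR stores each attachment's source in a `<wp:attachment_url>` element, so the sketch rewrites only those elements and leaves the rest of the file alone. The function name, the regular-expression approach, and both base URLs in the usage example are assumptions for illustration; a real implementation would more likely use a streaming XML processor.

```python
import re

# Matches <wp:attachment_url>…</wp:attachment_url>, with or without CDATA.
ATTACHMENT_URL = re.compile(
    r"(<wp:attachment_url>\s*(?:<!\[CDATA\[)?)"   # opening tag
    r"(.*?)"                                      # the URL itself
    r"((?:\]\]>)?\s*</wp:attachment_url>)",       # closing tag
    re.DOTALL,
)


def rewrite_attachment_urls(wxr: str, old_base: str, new_base: str) -> str:
    """Point attachment URLs under `old_base` at `new_base` instead."""

    def swap(match: re.Match) -> str:
        url = match.group(2)
        if url.startswith(old_base):
            url = new_base + url[len(old_base):]
        return match.group(1) + url + match.group(3)

    return ATTACHMENT_URL.sub(swap, wxr)


sample = (
    "<item><wp:attachment_url><![CDATA["
    "https://example.com/wp-content/uploads/2024/11/a.jpg"
    "]]></wp:attachment_url></item>"
)
rewritten = rewrite_attachment_urls(
    sample,
    "https://example.com/wp-content/uploads",
    "http://127.0.0.1/wp-content/package-media",  # hypothetical local location
)
```

URLs that do not start with `old_base` pass through unchanged, so attachments hosted elsewhere would still be fetched from their original location.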

@brandonpayton
Member Author

note: The blueprints package idea has its own issue now.

@zaerl -- I wanted to loop you in here. Please chime in if you have any feedback on the export-related comment above (or anything else :). We started this issue to consider how work on Assembler** and Blueprints can align in core. At first I thought of this as Assembler generating a Blueprint, but due to the way Assembler works, it looks like all we need is a full, self-contained export of the site at the end of the Assembler flow.

I'm not sure there's a lot of interesting work to do related to Assembler and exports, but at least it looks like a place to try the Blueprint + assets idea.

One question I had for you is:

Would it be reasonable to update the WordPress Importer so attachments can be created without fetching the files? If we include uploads within a Blueprint package, we can copy them into place, but currently the WP importer will not create attachments unless fetch_attachments is enabled.

** An Automattic theme that allows users to declare pages and settings for their site by selecting patterns, fonts, and styles in a wizard-like flow.

@zaerl
Contributor

zaerl commented Nov 21, 2024

I'm sorry, but I don't understand correctly what we want to do here. Do we want to import media copies from the FS instead of downloading them from somewhere? Or, in other words, create rows in the wp_posts table with post_type "attachment" for populating the media library?

fetch_attachments is disabled by default because it is a heavy operation, and a user should not be able to start it inadvertently. In WP-CLI, for example, it is enabled by default when you run wp import (see here). Adjusting that variable before running import() is enough.

Changing the way the wp-importer works is doable. The wp_insert_attachment function is perfectly fine for adding a post, regardless of whether the file has been downloaded. It is okay if it is in the wp_upload_dir().

There are multiple ways to do this. One, for example, is to get the path protocol of the attachment and run wp_safe_remote_get only if needed. This will not be a backward compatibility problem, but it will require some security measures to be in place, such as only accepting the local path from inside the WP folder.
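
The decision described above (fetch only when the source is remote, and accept local paths only from inside the WP folder) can be sketched in a language-neutral way. The Python below is not the importer's PHP and does not call any WordPress function; `classify_attachment_source` and its three return values are hypothetical names used to show the branching.

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_attachment_source(source: str, wp_root: str) -> str:
    """Return "fetch", "local", or "reject" for an attachment source.

    - http(s) URLs would still be downloaded.
    - Bare paths and file:// URLs count as local, but only when they
      resolve to a location inside `wp_root`.
    - Anything else is refused.
    """
    parsed = urlparse(source)
    scheme = parsed.scheme.lower()
    if scheme in ("http", "https"):
        return "fetch"
    if scheme not in ("", "file"):
        return "reject"

    raw_path = parsed.path if scheme == "file" else source
    root = Path(wp_root).resolve()
    # Relative paths are taken relative to the WP root; resolve() collapses
    # any ".." segments before the containment check.
    candidate = (root / raw_path).resolve()
    try:
        candidate.relative_to(root)
    except ValueError:
        return "reject"
    return "local"
```

Resolving the path before the check is what keeps `../` traversal out, which is the kind of security measure mentioned above.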

@adamziel moved this from Inbox to In progress in Playground Board Nov 28, 2024

@brandonpayton
Member Author

I'm sorry, but I don't understand correctly what we want to do here. Do we want to import media copies from the FS instead of downloading them from somewhere? Or, in other words, create rows in the wp_posts table with post_type "attachment" for populating the media library?

I'm sorry, @zaerl. I pinged you but didn't follow up on your reply.

Yes, the idea is that such an export package would include media files. It wouldn't necessarily have to be on disk in all cases, but it would have to be accessible and addressable. For example, we might either:

  • completely unpack an export zip that includes media
  • stream the same zip
  • access the same zip piecemeal via Range requests
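
To make the third option a bit more concrete, here is a rough sketch of reading a zip piecemeal. Python's `zipfile` only needs a seekable file-like object, so a thin wrapper that turns each read into a ranged fetch is enough for it to pull the central directory and then individual entries. `RangeReader` and the `fetch_range(start, end)` callable are hypothetical; in practice the callable would issue an HTTP request with a `Range` header, while here it slices an in-memory blob so the example runs offline.

```python
import io
import zipfile
from typing import Callable


class RangeReader(io.RawIOBase):
    """Seekable, read-only view of a resource served by ranged fetches."""

    def __init__(self, size: int, fetch_range: Callable[[int, int], bytes]):
        super().__init__()
        self._size = size
        self._fetch = fetch_range  # (start, end_exclusive) -> bytes
        self._pos = 0

    def readable(self) -> bool:
        return True

    def seekable(self) -> bool:
        return True

    def tell(self) -> int:
        return self._pos

    def seek(self, offset: int, whence: int = io.SEEK_SET) -> int:
        if whence == io.SEEK_SET:
            target = offset
        elif whence == io.SEEK_CUR:
            target = self._pos + offset
        elif whence == io.SEEK_END:
            target = self._size + offset
        else:
            raise ValueError("unsupported whence")
        self._pos = max(0, target)
        return self._pos

    def readinto(self, buffer) -> int:
        end = min(self._pos + len(buffer), self._size)
        if end <= self._pos:
            return 0  # at or past the end: nothing to fetch
        data = self._fetch(self._pos, end)
        buffer[: len(data)] = data
        self._pos += len(data)
        return len(data)


# Offline stand-in for a remote export package.
_buf = io.BytesIO()
with zipfile.ZipFile(_buf, "w") as _zf:
    _zf.writestr("blueprint.json", '{"steps": []}')
    _zf.writestr("wp-content/uploads/2024/11/a.jpg", b"x" * 5000)
package_bytes = _buf.getvalue()

requested = []  # records every (start, end) range that was asked for


def fake_fetch(start: int, end: int) -> bytes:
    requested.append((start, end))
    return package_bytes[start:end]


with zipfile.ZipFile(RangeReader(len(package_bytes), fake_fetch)) as remote:
    manifest_bytes = remote.read("blueprint.json")
    entry_names = remote.namelist()
```

Only the ranges recorded in `requested` are ever read, so a consumer could read the manifest and a single media file without transferring the rest of the archive. The total size would need to come from somewhere first, for example a `Content-Length` response header.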

Does that make sense?

Projects
Status: In progress
Development

No branches or pull requests

5 participants