Project Scope and ToDos
- Take a link and turn it into an oEmbed/Open Graph style share card
- Take a link and archive it in the most reliable way
- When the link is a tweet, display the tweet but also the whole tweet thread.
- When the link is a tweet, archive the tweets, and display them if the live ones are not available.
- Capture any embedded retweets in the thread. Capture their thread if one exists
- Capture any links in the tweet
- Create the process as an abstract function that returns the data in a savable way
- Archive links on Archive.org and save the resulting archival links
- Create link IDs that can be used to cache related content
- Integrate it into the site to be able to make context pages here.
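For the link ID item, here is a minimal sketch of one way it could work, assuming the ID is just a hash of the normalized URL. The function name createLinkId and the choice of SHA-256 are my own assumptions for illustration, nothing is settled yet.

```javascript
const { createHash } = require("node:crypto");

// Hypothetical helper: derive a stable ID from a link so that related
// content (archives, thread captures) can be cached under the same key.
const createLinkId = (link) => {
  // new URL() normalizes the host casing and adds a trailing slash to a
  // bare origin, so trivially different spellings share one ID.
  const normalized = new URL(link).href;
  return createHash("sha256").update(normalized).digest("hex");
};

console.log(createLinkId("http://aramzs.github.io/about/"));
```

The same input always yields the same 64-character hex string, which makes it usable as a filename or cache key.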
Day 3
Ok, yesterday I was trying to knock down the oEmbed process from Facebook and getting nothing. Let's take this back to first principles and see if I can make a request outside of Node that gets what I need.
Ok, it looks like I don't have the right permissions for my Facebook app? Sort of taking the "o" out of oEmbed if I need an app, permissions, and a key, isn't it, Facebook?
Ok, to get the oEmbed process working I need to have my app verified on Facebook... which means uploading a photo of my government-issued ID? Nope, fk that. Ok, just no Facebook oEmbeds in this process then.
Ok, let's grab the page data that tells us about a post now. To do that, I'm going to use a classic package I've done some work in before: JSDOM.
JSDOM can do its own requests, but I would prefer to handle that as a separate step.
First I'm going to build a basic object that can contain data about the page that should be useful. I want to predefine a few namespaces I would use. Let's pull the standard stuff from the meta tags and JSON-LD. I can also use Dublin Core potentially. I can also use h-card perhaps or h-entry? We can try that out at some later point.
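A rough sketch of what that starting object could look like. The property names match the data object that shows up at the end of this log; the factory function createLinkObject and the default values are assumptions for illustration.

```javascript
// Hypothetical factory: one empty namespace per metadata source, so later
// steps can fill each in independently.
const createLinkObject = (link, sanitizedLink = link) => ({
  originalLink: link,
  sanitizedLink,
  oembed: false, // no oEmbed data unless a provider gives us some
  jsonLd: {},
  status: false, // assumed placeholder until the HTTP request completes
  metadata: {}, // standard meta tags: title, author, description...
  dublinCore: {},
  opengraph: {},
  twitter: {},
});

console.log(createLinkObject("http://aramzs.github.io/about/"));
```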
Ok, so once I have the DOM set up how can I grab the data I need?
On the DOM object I can execute window.document.getElementsByTagName("meta"); and get a list back. Interestingly, tags using the name attribute are accessible on the resulting object by name. For OpenGraph we can use a wildcard-style prefix search with querySelectorAll:
const openGraphNodes = window.document.querySelectorAll(
  "meta[property^='og:']"
);
Ok, so I need to set up some tests to make sure it is working as expected.
Can I use to.equal in mocha?
result.metadata.keyvalues.equal([
  "jekyll",
  "social-media",
]);
Apparently not.
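Ok, did some searching around and it looks like the right way to handle this is Chai's members assertion, since equal does a strict comparison and two separate arrays are never strictly equal:
expect(result.metadata.keyvalues).to.have.members([
  "jekyll",
  "social-media",
]);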
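To see why the plain equality check fails, here is a small sketch in plain JavaScript with no test library involved. The helper hasSameMembers is a hypothetical stand-in that only roughly approximates what a members-style assertion checks; it is not Chai's implementation.

```javascript
const expectedKeywords = ["jekyll", "social-media"];
const actualKeywords = ["social-media", "jekyll"];

// Two distinct array instances are never strictly equal, even with
// identical contents.
console.log(actualKeywords === expectedKeywords); // false

// Hypothetical helper: same length, and every item of each array appears
// in the other, regardless of order.
const hasSameMembers = (a, b) =>
  a.length === b.length &&
  a.every((item) => b.includes(item)) &&
  b.every((item) => a.includes(item));

console.log(hasSameMembers(actualKeywords, expectedKeywords)); // true
```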
Ok, things are working. But I think I can make this better code by simplifying and abstracting the functions around the querySelector. The OpenGraph and Twitter based meta values are all based on RDF and we can analyze them in a similar way.
const pullMetadataFromRDFProperty = (documentObj, topNode) => {
  // Select every meta tag whose property starts with the prefix, e.g. "og:".
  const graphNodes = documentObj.querySelectorAll(
    `meta[property^='${topNode}:']`
  );
  const openGraphObject = Array.from(graphNodes).reduce((prev, curr) => {
    // Read the property attribute by name rather than by position, so the
    // order of attributes in the markup does not matter.
    const keyValue = curr
      .getAttribute("property")
      .replace(`${topNode}:`, "");
    if (Object.prototype.hasOwnProperty.call(prev, keyValue)) {
      // A repeated key (several og:image tags, say) becomes an array.
      const lastValue = prev[keyValue];
      if (Array.isArray(lastValue)) {
        prev[keyValue].push(curr.content);
      } else {
        prev[keyValue] = [lastValue, curr.content];
      }
    } else {
      prev[keyValue] = curr.content;
    }
    return prev;
  }, {});
  // console.log("openGraphObject", openGraphObject);
  return openGraphObject;
};
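The interesting part of that reducer is what happens when a key repeats. Here is a DOM-free sketch of just that merge step, run over plain key/value pairs; groupByKey and the sample values are made up for illustration.

```javascript
// Hypothetical stand-in for the reducer above, working on [key, value]
// pairs instead of meta elements.
const groupByKey = (pairs) =>
  pairs.reduce((acc, [key, value]) => {
    if (Object.hasOwn(acc, key)) {
      // [].concat flattens an existing array and wraps a lone string, so
      // the second and third duplicates are handled the same way.
      acc[key] = [].concat(acc[key], value);
    } else {
      acc[key] = value;
    }
    return acc;
  }, {});

const grouped = groupByKey([
  ["title", "A post"],
  ["image", "a.jpg"],
  ["image", "b.jpg"],
  ["image", "c.jpg"],
]);

console.log(grouped.title); // "A post"
console.log(grouped.image); // [ 'a.jpg', 'b.jpg', 'c.jpg' ]
```

A key seen once stays a plain string and only becomes an array on its second occurrence, so consumers need to handle both shapes.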
git commit -am "Setting up scrape of OpenGraph data and supporting unit tests"
Now I can use this function to capture the Twitter metadata as well!
git commit -am "Setting up scrape of twitter data"
Oh wait, I need to account for the fact that some tags are using name and some are using property.
git commit -am "Fix pullMetadataFromRDFProperty to have a prop type"
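One possible shape for that fix, as a sketch. The propType parameter name is borrowed from the commit message; everything else, including the tiny fake document used to exercise it without JSDOM, is an assumption for illustration.

```javascript
// propType says which attribute carries the key: "property" for
// Open Graph style tags, "name" for tags that use the name attribute.
const pullMetadataByAttribute = (documentObj, topNode, propType = "property") => {
  const nodes = documentObj.querySelectorAll(
    `meta[${propType}^='${topNode}:']`
  );
  return Array.from(nodes).reduce((prev, curr) => {
    const key = curr.getAttribute(propType).replace(`${topNode}:`, "");
    prev[key] = Object.hasOwn(prev, key)
      ? [].concat(prev[key], curr.content)
      : curr.content;
    return prev;
  }, {});
};

// Hypothetical stub that understands only the one selector shape used
// above. It stands in for a real DOM document in this sketch.
const fakeDocument = (tags) => ({
  querySelectorAll: (selector) => {
    const [, attr, prefix] = selector.match(/^meta\[(\w+)\^='(.+)'\]$/);
    return tags
      .filter((tag) => (tag[attr] || "").startsWith(prefix))
      .map((tag) => ({
        getAttribute: (name) => tag[name] ?? null,
        content: tag.content,
      }));
  },
});

const sampleDoc = fakeDocument([
  { property: "og:title", content: "A post" },
  { name: "twitter:card", content: "summary" },
]);

console.log(pullMetadataByAttribute(sampleDoc, "og")); // { title: 'A post' }
console.log(pullMetadataByAttribute(sampleDoc, "twitter", "name")); // { card: 'summary' }
```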
A few more modifications and I can get it to capture Dublin Core if available as well.
I can even build some tests to prove some negative cases. That should be useful for more comprehensive testing.
Basically this should allow me to compose a bunch of different tests with different HTML.
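A sketch of how composing test HTML might look. The helper buildHtmlFixture is hypothetical and does no attribute escaping, so it only suits hand-written test values.

```javascript
// Hypothetical fixture builder: each entry in metaTags is an object of
// attribute/value pairs that becomes one <meta> tag.
const buildHtmlFixture = (metaTags = []) => {
  const tags = metaTags
    .map(
      (attrs) =>
        `<meta ${Object.entries(attrs)
          .map(([key, value]) => `${key}="${value}"`)
          .join(" ")}>`
    )
    .join("\n    ");
  return `<!DOCTYPE html>
<html>
  <head>
    ${tags}
  </head>
  <body></body>
</html>`;
};

// Positive case: a page with Open Graph data.
const withOpenGraph = buildHtmlFixture([
  { property: "og:title", content: "A post" },
]);
// Negative case: a page with no metadata at all.
const withoutMetadata = buildHtmlFixture();

console.log(withOpenGraph);
```

Each test can then feed a different fixture string into the parser and assert on what comes back, including that a page without tags yields empty objects.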
git commit -am "More extensive test coverage"
Looking good. Now I want to test it end to end.
describe("should create link objects from domain requests", function () {
  this.timeout(5000);
  it("should resolve a basic URL", async function () {
    const result = await linkModule.getLinkData({
      sanitizedLink:
        "http://aramzs.github.io/jekyll/social-media/2015/11/11/be-social-with-jekyll.html",
      link: "http://aramzs.github.io/jekyll/social-media/2015/11/11/be-social-with-jekyll.html",
    });
    result.status.should.equal(200);
    result.metadata.title.should.equal(
      "How to make your Jekyll site show up on social"
    );
    result.metadata.author.should.equal("Aram Zucker-Scharff");
    result.metadata.description.should.equal(
      "Here's how to make Jekyll posts easier for others to see and share on social networks."
    );
    result.metadata.canonical.should.equal(
      "http://aramzs.github.io/jekyll/social-media/2015/11/11/be-social-with-jekyll.html"
    );
    expect(result.metadata.keywords).to.have.members([
      "jekyll",
      "social-media",
    ]);
    result.opengraph.title.should.equal(
      "How to make your Jekyll site show up on social"
    );
    result.opengraph.locale.should.equal("en_US");
    result.opengraph.description.should.equal(
      "Here's how to make Jekyll posts easier for others to see and share on social networks."
    );
    result.opengraph.url.should.equal(
      "http://aramzs.github.io/jekyll/social-media/2015/11/11/be-social-with-jekyll.html"
    );
    result.twitter.card.should.equal("summary_large_image");
    result.twitter.creator.should.equal("@chronotope");
    result.twitter.title.should.equal(
      "How to make your Jekyll site show up on social"
    );
    result.twitter.image.should.equal(
      "https://raw.githubusercontent.com/AramZS/aramzs.github.io/master/_includes/tumblr_nwncf1T2ht1rl195mo1_1280.jpg"
    );
    result.dublinCore.Format.should.equal("video/mpeg; 10 minutes");
    result.dublinCore.Language.should.equal("en");
    result.dublinCore.Publisher.should.equal("publisher-name");
    result.dublinCore.Title.should.equal("HYP");
    result.jsonLd["@type"].should.equal("BlogPosting");
    result.jsonLd.headline.should.equal(
      "How to make your Jekyll site show up on social"
    );
    result.jsonLd.description.should.equal(
      "Here's how to make Jekyll posts easier for others to see and share on social networks."
    );
    expect(result.jsonLd.image).to.have.members([
      "https://raw.githubusercontent.com/AramZS/aramzs.github.io/master/_includes/tumblr_nwncf1T2ht1rl195mo1_1280.jpg",
    ]);
  });
});
Oh, I forgot, I need to await response.text()!
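A minimal sketch of the request step with both awaits in place, assuming a fetch-style API. The function name fetchPage and the injectable fetchImpl parameter are my own additions so it can run without touching the network; the real request code is not shown in this log.

```javascript
// Hypothetical request helper, kept separate from the DOM parsing step.
const fetchPage = async (link, fetchImpl = globalThis.fetch) => {
  const response = await fetchImpl(link);
  // response.text() returns a Promise as well. Without this await, html
  // would be a pending Promise rather than the page source.
  const html = await response.text();
  return { status: response.status, html };
};

// Usage with a stubbed fetch, so nothing leaves the machine.
const stubFetch = async () => ({
  status: 200,
  text: async () => "<html><head><title>Stub</title></head></html>",
});

fetchPage("http://example.com/", stubFetch).then((page) => {
  console.log(page.status, page.html.length);
});
```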
Ok, a few more tweaks, plus a reminder that I don't have Dublin Core on my actual site, and it should be good to go.
git commit -am "End to end unit test for building a link object"
Now I have a good looking data object I can use to build context cards:
{
  originalLink: 'http://aramzs.github.io/jekyll/social-media/2015/11/11/be-social-with-jekyll.html',
  sanitizedLink: 'http://aramzs.github.io/jekyll/social-media/2015/11/11/be-social-with-jekyll.html',
  oembed: false,
  jsonLd: {
    '@type': 'BlogPosting',
    headline: 'How to make your Jekyll site show up on social',
    description: "Here's how to make Jekyll posts easier for others to see and share on social networks.",
    image: [
      'https://raw.githubusercontent.com/AramZS/aramzs.github.io/master/_includes/tumblr_nwncf1T2ht1rl195mo1_1280.jpg'
    ],
    mainEntityOfPage: {
      '@type': 'WebPage',
      '@id': 'http://aramzs.github.io/jekyll/social-media/2015/11/11/be-social-with-jekyll.html'
    },
    datePublished: '2015-11-11 10:34:51 -0500',
    dateModified: '2015-11-11 10:34:51 -0500',
    isAccessibleForFree: 'True',
    isPartOf: {
      '@type': [ 'CreativeWork', 'Product', 'Blog' ],
      name: 'Fight With Tools',
      productID: 'aramzs.github.io'
    },
    discussionUrl: false,
    license: 'http://creativecommons.org/licenses/by-sa/4.0/',
    author: {
      '@type': 'Person',
      name: 'Aram Zucker-Scharff',
      description: 'Aram Zucker-Scharff is Director for Ad Engineering at Washington Post, lead dev for PressForward and a consultant. Tech solutions for journo problems.',
      sameAs: 'http://aramzs.github.io/aramzs/',
      image: {
        '@type': 'ImageObject',
        url: 'https://raw.githubusercontent.com/AramZS/aramzs.github.io/master/_includes/Aram-Zucker-Scharff-square.jpg'
      },
      givenName: 'Aram',
      familyName: 'Zucker-Scharff',
      alternateName: 'AramZS',
      publishingPrinciples: 'http://aramzs.github.io/about/'
    },
    publisher: {
      '@type': 'Organization',
      name: 'Fight With Tools',
      description: "A site discussing how to imagine, build, analyze and use cool code and web tools. Better websites, better stories, better developers. Technology won't save the world, but you can.",
      sameAs: 'http://aramzs.github.io',
      logo: {
        '@type': 'ImageObject',
        url: 'https://41.media.tumblr.com/709bb3c371b9924add351bfe3386e946/tumblr_nxdq8uFdx81qzocgko1_1280.jpg'
      },
      publishingPrinciples: 'http://aramzs.github.io/about/'
    },
    editor: {
      '@type': false,
      name: false,
      description: false,
      sameAs: false,
      image: { '@type': false, url: false },
      givenName: false,
      familyName: false,
      alternateName: false,
      publishingPrinciples: false
    },
    '@context': 'http://schema.org'
  },
  status: 200,
  metadata: {
    author: 'Aram Zucker-Scharff',
    title: 'How to make your Jekyll site show up on social',
    description: "Here's how to make Jekyll posts easier for others to see and share on social networks.",
    canonical: 'http://aramzs.github.io/jekyll/social-media/2015/11/11/be-social-with-jekyll.html',
    keywords: [ 'jekyll', 'social-media' ]
  },
  dublinCore: {},
  opengraph: {
    title: 'How to make your Jekyll site show up on social',
    description: "Here's how to make Jekyll posts easier for others to see and share on social networks.",
    url: 'http://aramzs.github.io/jekyll/social-media/2015/11/11/be-social-with-jekyll.html',
    site_name: 'Fight With Tools by AramZS',
    locale: 'en_US',
    type: 'article',
    typeObject: {
      published_time: '2015-11-11 10:34:51 -0500',
      modified_time: false,
      author: 'http://facebook.com/aramzs',
      publisher: 'https://www.facebook.com/aramzs',
      section: 'Code',
      tag: [ 'jekyll', 'social-media' ]
    },
    image: 'https://raw.githubusercontent.com/AramZS/aramzs.github.io/master/_includes/tumblr_nwncf1T2ht1rl195mo1_1280.jpg'
  },
  twitter: {
    site: '@chronotope',
    description: "Here's how to make Jekyll posts easier for others to see and share on social networks.",
    card: 'summary_large_image',
    creator: '@chronotope',
    title: 'How to make your Jekyll site show up on social',
    image: 'https://raw.githubusercontent.com/AramZS/aramzs.github.io/master/_includes/tumblr_nwncf1T2ht1rl195mo1_1280.jpg'
  }
}
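As a sketch of how this object could feed a context card: pick each card field from whichever namespace has it. The function buildCardData and the fallback order are assumptions for illustration, not a decided design.

```javascript
// Values may be a string, an array (repeated tags), false, or missing.
// first() reduces all of those to a single value or undefined.
const first = (value) => [].concat(value || [])[0];

// Return the first usable candidate, or false to match the object's
// convention for "nothing found".
const firstAvailable = (...candidates) =>
  candidates.map(first).find(Boolean) || false;

// Hypothetical card builder with an assumed fallback order.
const buildCardData = (linkData) => {
  const { jsonLd = {}, opengraph = {}, twitter = {}, metadata = {} } = linkData;
  return {
    url: firstAvailable(metadata.canonical, opengraph.url, linkData.sanitizedLink),
    title: firstAvailable(opengraph.title, twitter.title, jsonLd.headline, metadata.title),
    description: firstAvailable(
      opengraph.description,
      twitter.description,
      jsonLd.description,
      metadata.description
    ),
    image: firstAvailable(opengraph.image, twitter.image, jsonLd.image),
  };
};

console.log(
  buildCardData({
    sanitizedLink: "http://example.com/post",
    jsonLd: { headline: "From JSON-LD", image: ["http://example.com/a.jpg"] },
    metadata: { description: "From meta tags" },
  })
);
```

With the sample input, the title falls through to the JSON-LD headline and the image comes from the first entry of the JSON-LD image array.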