Convert a URL to a valid and safe filename in NodeJS

A recent web scraping project required creating hundreds of files to store the contents of scraped web pages. The question was how best to name the file for a particular URL.

Initially I thought of using the shortid npm package. However, the client's requirement was that he should be able to see easily which web page was stored in which file. This meant using the URL itself as the source for the filename. After a short search I found the filenamify-url npm package, which creates a filename from a URL that is valid on both Unix-like systems and Windows. Details on how to use the package are given below.

Installation is as usual using npm.

npm install --save filenamify-url

Usage is simple enough.

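// CommonJS require, as used here; recent releases of the package may be
// ESM-only, in which case use: import filenamifyUrl from 'filenamify-url';
const filenamifyUrl = require('filenamify-url');

const valid_file = filenamifyUrl('https://www.youtube.com/watch?v=M9ZYEb0Vf8U');
console.log(valid_file);

//=> youtube.com!watch!v=M9ZYEb0Vf8U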

The URL's protocol and the leading www. are removed, and any characters that are not valid in a filename are replaced with the '!' character. You can, however, specify a different replacement character in place of the default '!'.

const valid_file = filenamifyUrl('https://www.youtube.com/watch?v=M9ZYEb0Vf8U', {replacement: '#'});
console.log(valid_file);

//=> youtube.com#watch#v=M9ZYEb0Vf8U

Note: On Unix-like systems the character / is reserved; on Windows the reserved characters are <>:"/\|?*.
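
To make the replacement rule concrete, here is a minimal, dependency-free sketch of the same idea. It is not the package's actual implementation: the function name sketchUrlToFilename and its regular expressions are my own assumptions, written only to reproduce the behaviour seen in the examples above (protocol and leading www. dropped, reserved characters replaced).

```javascript
// Hypothetical helper, NOT part of filenamify-url: a rough approximation
// of the behaviour shown in the examples above.
function sketchUrlToFilename(url, replacement = '!') {
  const stripped = url
    .replace(/^[a-z][a-z0-9+.-]*:\/\//i, '') // drop the protocol, e.g. "https://"
    .replace(/^www\./i, '')                  // drop a leading "www."
    .replace(/\/+$/, '');                    // drop trailing slashes

  // Replace every character reserved on Unix-like systems or Windows,
  // plus ASCII control characters.
  return stripped.replace(/[<>:"\/\\|?*\x00-\x1f]/g, replacement);
}

console.log(sketchUrlToFilename('https://www.youtube.com/watch?v=M9ZYEb0Vf8U'));
//=> youtube.com!watch!v=M9ZYEb0Vf8U
```

For real projects the package is still the better choice, since this sketch ignores edge cases such as very long URLs or names that are reserved on Windows.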

Once we have a valid filename string, we can add a prefix or suffix with additional information depending on one's requirements, such as a timestamp and an extension.

const url = 'https://www.youtube.com/watch?v=M9ZYEb0Vf8U';
const valid_file = filenamifyUrl(url);
const final_filename = valid_file + "-" + Date.now() + ".txt";

//=> e.g. youtube.com!watch!v=M9ZYEb0Vf8U-1526470128602.txt (the timestamp varies)
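
With the final filename in hand, storing the scraped contents is a matter of writing the file. The sketch below uses only Node's built-in fs, os and path modules; the helper saveScrapedPage, the scraped-pages directory and the placeholder page content are assumptions made for illustration, and the filename is passed in as a plain string so the snippet runs without the package installed.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: writes `content` into `dir`, naming the file
// <validFile>-<timestamp>.txt as in the snippet above.
function saveScrapedPage(dir, validFile, content) {
  const finalFilename = validFile + '-' + Date.now() + '.txt';
  const fullPath = path.join(dir, finalFilename);
  fs.mkdirSync(dir, { recursive: true }); // create the output directory if missing
  fs.writeFileSync(fullPath, content, 'utf8');
  return fullPath;
}

// Assumed output location: a "scraped-pages" folder in the system temp directory.
const outDir = path.join(os.tmpdir(), 'scraped-pages');
const savedPath = saveScrapedPage(outDir, 'youtube.com!watch!v=M9ZYEb0Vf8U', '<html>...</html>');
console.log(savedPath);
```

The synchronous fs calls keep the sketch short; in a scraper that writes hundreds of files, the promise-based fs API would probably be the better fit.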

Reference: http://www.linfo.org/file_name.html
