Make README markdown
parent eb18f9b698
commit c0846a9b52
56
README
@@ -1,56 +0,0 @@
This document is intended for people looking to contribute to Ikibooru. For Ikibooru users, check the official webpage: https://mid.net.ua/ikibooru.html
Ikibooru uses Pegasus.lua as a barebones HTTP library, despite the latter's intention to be larger in scope. This is because it fails at this task. As such, Ikibooru's version of Pegasus.lua is also modified to include additional features:
1. Support for multipart/form-data requests (parseMultipartFormData @ pegasus/request.lua);
2. set FD_CLOEXEC on each socket using luaposix;
3. more specific logging;
4. removed a really stupid feature where it tries to write to a response already written, if the status code is 404.
(UNDONE) 5. Request handlers are placed in coroutines and run in a round-robin fashion, so as to make sure one request or response wouldn't block other users
The first is necessary for file uploads to be possible.
The second is necessary as Ikibooru calls upon the shell for a few things, such as generating a thumbnail or sending an e-mail. Otherwise child processes inherit open file descriptors, which causes clients to wait until any internal processes end.
Yes, it calls upon the shell. This quirk is why Ikibooru currently supports only Unices running Bourne-like shells. The offending lines appear in smtpauth.lua and db.lua. Protection against command injection is done by wrapping potentially untrusted input in single quotes, the contents of which the shell should interpret literally. Any single quotes in the input are first turned into '"'"'. Behaviour should be equivalent to shlex.quote from the Python standard library. On paper this should achieve perfect command injection prevention, but more testing is required. More on this later.
The aptly named html.lua handles formatting such as escaping or unescaping, and so must also be tested for security. The exception is SQL escaping which is done by LuaSQL (which wraps mysql_real_escape_string).
db.lua handles all SQL access and filesystem access for storing individual object files.
db.lua also handles valid formats for usernames, object names, emails, etc. For example, object names are not permitted to be CON, PRN, AUX, LPT1...LPT9, etc., due to these names being reserved by Windows (testobjfilename @ db.lua). It also makes sure a filename does not contain slashes, backslashes or unprintable ASCII characters, is not any other reserved filename, does not consist only of dots, etc. These checks should make it more difficult to perform any command injection.
Maybe valid names should be handled by html.lua?
Ikibooru encodes object IDs using AES-128-ECB so as to hide their chronology. To do this it generates a random 16-byte string during installation, which it will use as a key forever. The security of object IDs is minor and their disclosure isn't as catastrophic as, say, the bypassing of user verification. The use of ECB is deliberate as it does not require an IV, and having unique plaintext-ciphertext pairs is useful. In other words, AES is not used in such a way that the choice of ECB will be detrimental to security. However, it is extremely important that this key is not lost or changed; otherwise all external links to objects will suddenly be invalidated.
db.lua makes the distinction of virtual and physical filenames (terminology borrowed from the virtual and physical memory of computer architectures). Virtual filenames are those shown to the end user (/objd/ABCD1234..ABCD1234), whereas physical filenames are those used internally (objd/AB/CD/12/34/../AB/CD/12/34). Ikibooru is designed to organize the filesystem in such a manner so as to minimize any penalties from having too many entries in any one directory.
main.lua is where the server begins running. All requests caught by Pegasus.lua are passed to the callback function. Within it, a few special paths are passed to specific codepaths, and the rest go to the template system. Make no mistake, the templates do more than format a page. They may also handle requests and trigger serious operations.
Note that the generic template path is wrapped with a pcall to catch errors. This lets the server keep running should anything go wrong. The same is not done for the special paths, such as /verif, /addobj, etc. Therefore, making those parts extra error-resistant is more important.
Ikibooru uses Lyre to compile templates, and it's done once, before Pegasus.lua is first initialized. Prior to passing to Lyre, Ikibooru preprocesses lines starting with {# to implement a basic "inclusion" system. Currently, all templates at the end include base.inc, which grants them a standard layout with a header, sign-in/out button, etc.
BigGlobe handles the main configuration. The name is composed of Big (which means "great" or "powerful") and Globe (which means global). The name has nothing to do with flat-earthers nor globe-earthers.
Sessions are done client-side, similarly to how JWT works. A verification link containing the session token is sent to the user's e-mail address. The token is composed of a few pieces of data concatenated with their HMAC-SHA256 digest. The session key used for the HMAC is 64 bytes long and is generated anew at each server launch. It is good practice to restart the server every now and then to invalidate all sessions, which also nullifies any efforts to brute-force the session key. Assuming perfect security, "every now and then" means every few thousand years, but let us say every few months just in case.
Notice that only an HMAC is used. There is no encryption, which means no secrecy, only integrity. So far this has been good enough.
While Ikibooru can run on just about any Unix-like system, an automatic installer exists only for Linux. More would be welcome.
obje.html.l is arguably the ugliest part of the entire program, in part due to the unreadable JS code within. The problem is that Pegasus.lua deals with only one request at a time. Should a large file be uploaded (say 50MB), it will effectively block the server from doing anything else. This is unacceptable, so JS is used to upload files in chunks of 512KB. Requests from other users can then be serviced in between chunks. Now obje does the following upon submitting a form:
1. Send a POST request to delete appropriately marked files
2. If there are new files, send a POST request to create empty files of specific sizes (to make sure such sizes are allowed in the first place)
2.5. Send a POST request for each 512KB chunk of every file, which the server then places at the appropriate offset
3. Allow the rest of the form to be submitted naturally
What makes the code unreadable (IMO) is its asynchronous nature. It's hard to tell the overall control flow, what with all of the onreadystatechange events.
Later on I realized that large responses (file downloads) block the server as well. I don't know why it took so long, but that's what led to change no. 5 of Pegasus.lua. I still kept the uploading JS code in obje.html.l, because the page needed JS before that anyway, and it provided better UX.
Unfortunately, mod no. 5 cut processing speed down to around 66% of what it was without it. Since Ikibooru is already meant to be used together with a reverse proxy, and that proxy likely supports accelerated static file serving (X-Sendfile or X-Accel-Redirect), the mod was ultimately undone. Perhaps it should be made optional?
Times are stored in the database as DATETIMEs in the UTC time zone. I don't remember why this was done. I suppose because the TIMESTAMP datatype (which uses UTC automatically) breaks after 2038, whereas DATETIME can go up to the year 9999. The server passes times to browsers in UTC (strings directly from the DB engine), and client-side JS code then shifts them into the user's local time zone (static/datetimes.js). This means that should JS be disabled, all times will be displayed as UTC.
main.lua and core.lua are split to later separate Ikibooru the "service" from Ikibooru the "server". Some reverse proxies can create Lua environments of their own and have Ikibooru attached to them directly (skipping any inter-server communication overhead). That way, Ikibooru wouldn't have to run as an HTTP server at all, but this isn't supported yet.
74
README.md
Normal file
@@ -0,0 +1,74 @@
This document is intended for people looking to study and/or contribute to Ikibooru. For Ikibooru users, check the official webpage: https://mid.net.ua/ikibooru.html
---
While Ikibooru can run on just about any Unix-like system, an automatic installer exists only for Linux. I intend to create some for the BSDs, and maybe Windows.
Ikibooru uses Pegasus.lua as a barebones HTTP library, despite the latter's intention to be larger in scope. This is because Pegasus.lua leaves a lot to be desired. As such, Ikibooru's version of Pegasus.lua is also modified to include additional features:
1. Support for multipart/form-data requests (parseMultipartFormData @ pegasus/request.lua);
2. set FD_CLOEXEC on each socket using luaposix;
3. more specific logging;
4. removed a really stupid feature where it tries to write to a response already written, if the status code is 404.
5. ~~Request handlers are placed in coroutines and run in a round-robin fashion, so as to make sure one request or response wouldn't block other users~~
The first is necessary for file uploads to be possible.
The second is necessary as Ikibooru calls upon the shell for a few things, such as generating a thumbnail or sending an e-mail. Otherwise child processes inherit open file descriptors, which causes clients to wait until any internal processes end.
Yes, it calls upon the shell. This quirk is why Ikibooru currently supports only Unices running Bourne-like shells. The offending lines appear in `smtpauth.lua` and `db.lua`. Protection against command injection is done by wrapping potentially untrusted input in single quotes, the contents of which the shell should interpret literally. Any single quotes in the input are first turned into `'"'"'`. Behaviour should be equivalent to `shlex.quote` from the Python standard library. On paper this should achieve perfect command injection prevention, but more testing is required.
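The quoting rule described above can be sketched in a few lines. This is an illustration in Python rather than the Lua code in `db.lua` or `smtpauth.lua`, and the function name `sh_single_quote` is made up for the example. It only shows the single-quote wrapping and compares it against `shlex.quote`.

```python
import shlex


def sh_single_quote(value: str) -> str:
    """Wrap value in single quotes for a Bourne-like shell.

    Every embedded single quote becomes '"'"' : close the quoted string,
    emit a double-quoted single quote, then reopen the quoted string.
    """
    return "'" + value.replace("'", "'\"'\"'") + "'"


# A hostile value that tries to break out of the quotes.
hostile = "x'; rm -rf ~; echo '"
quoted = sh_single_quote(hostile)

# For input that needs quoting at all, shlex.quote builds the same string.
print(quoted == shlex.quote(hostile))

# A POSIX-style parser reads the quoted form back as one literal argument.
print(shlex.split(quoted) == [hostile])
```

Note that `shlex.quote` returns strings made only of safe characters unchanged, whereas this sketch always adds the quotes. Both forms are read by the shell as the same single argument.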
---
Ikibooru hides object IDs using AES-128-ECB so as to mask their chronology and prevent bots from walking through every object starting from ID 1. To do this it generates a random 16-byte string during installation, which it will use as a permanent key. The security of object IDs is minor and their disclosure isn't as catastrophic as, say, the bypassing of user verification.
The use of ECB is deliberate as it does not require an IV, and having unique plaintext-ciphertext pairs is useful. In other words, AES is not used in such a way that the choice of ECB will be detrimental to security. However, it is extremely important that this key is not lost or changed; otherwise all external links to objects will suddenly be invalidated.
---
The aptly named `html.lua` handles formatting such as escaping or unescaping, and so must also be tested for security. The exception is SQL escaping which is done by LuaSQL (which wraps `mysql_real_escape_string`).
`db.lua` handles all SQL access and filesystem access for storing individual object files.
`db.lua` also handles valid formats for usernames, object names, emails, etc. For example, object names are not permitted to be `CON`, `PRN`, `AUX`, `LPT1`...`LPT9`, etc., due to these names being reserved by Windows (`testobjfilename` @ `db.lua`). It also makes sure a filename does not contain slashes, backslashes or unprintable ASCII characters, is not any other reserved filename, does not consist only of dots, etc. These checks should make it more difficult to perform any command injection.
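A rough Python sketch of this kind of filename check follows. It is not a port of `testobjfilename`. The reserved-name set, the case-insensitive comparison and the handling of extensions are assumptions made for the example, and the real rules in `db.lua` may be stricter or different.

```python
import re

# Assumption: the classic Windows device names. The actual list used by
# db.lua is not reproduced here and may differ.
RESERVED_NAMES = {"CON", "PRN", "AUX", "NUL"} | {
    f"{prefix}{n}" for prefix in ("COM", "LPT") for n in range(1, 10)
}

# Slash, backslash, ASCII control characters and DEL.
FORBIDDEN_CHARS = re.compile(r"[/\\\x00-\x1f\x7f]")


def is_acceptable_filename(name: str) -> bool:
    """Return True if name passes the checks sketched above."""
    if not name or FORBIDDEN_CHARS.search(name):
        return False
    # Names such as "." or ".." consist only of dots.
    if set(name) == {"."}:
        return False
    # Windows also treats a reserved name followed by an extension as
    # reserved (for example CON.txt), so only the part before the first
    # dot is compared.
    stem = name.split(".", 1)[0].upper()
    return stem not in RESERVED_NAMES


for candidate in ("cat.png", "con", "LPT1.txt", "..", "a/b"):
    print(candidate, is_acceptable_filename(candidate))
```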
`db.lua` makes the distinction of virtual and physical filenames (terminology borrowed from the virtual and physical memory of computer architectures). Virtual filenames are those shown to the end user (`/objd/ABCD1234..ABCD1234`), whereas physical filenames are those used internally (`objd/AB/CD/12/34/../AB/CD/12/34`). This is done to minimize any penalties from having too many entries in any one directory.
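The mapping from a virtual to a physical name can be pictured as splitting the ID into two-character directory levels. The Python sketch below assumes exactly that and nothing more. The function name `physical_path` is hypothetical, and the example uses a shortened ID rather than a full-length one.

```python
def physical_path(object_id: str, root: str = "objd") -> str:
    """Split an ID into two-character directory levels under root."""
    if len(object_id) % 2 != 0:
        raise ValueError("expected an even number of characters")
    pairs = [object_id[i:i + 2] for i in range(0, len(object_id), 2)]
    return "/".join([root, *pairs])


print(physical_path("ABCD1234"))  # objd/AB/CD/12/34
```

With two hexadecimal characters per level, no directory ever holds more than 256 entries.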
---
`main.lua` is where the server begins running. All requests caught by Pegasus.lua are passed to the callback function. Within it, a few special paths are passed to specific codepaths, and the rest go to the template system. Make no mistake, the templates do more than format a page. They may also handle requests and trigger serious operations.
Note that the generic template path is wrapped with `pcall` to catch errors. This lets the server keep running, should anything occur. The same is not done for the special paths, such as /verif, /addobj, etc.
---
Ikibooru uses Lyre to compile templates, and it's done once, before Pegasus.lua is first initialized. Prior to passing to Lyre, Ikibooru preprocesses lines starting with `{#` to implement a basic inclusion system. Currently, all templates include `base.inc`, which grants them a standard layout with a header, sign-in/out button, etc.
BigGlobe handles the main configuration. The name is composed of Big (which means "great" or "powerful") and Globe (which means global). The name has nothing to do with flat-earthers nor globe-earthers.
---
Sessions are done client-side, similarly to how JWT works. A verification link containing the session token is sent to the user's e-mail address. The token is composed of a few pieces of data concatenated with their HMAC-SHA256 digest. The session key used for the HMAC is 64 bytes long and is generated anew at each server launch. It is good practice to restart the server every now and then to invalidate all sessions, which also nullifies any efforts to brute-force the session key. Assuming perfect security, "every now and then" means every few thousand years, but let us say every few months just in case.
Only an HMAC is used. There is no encryption, which means no secrecy, only integrity. So far this has been good enough.
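To make the idea concrete, here is a minimal Python sketch of an integrity-only token of this kind. The payload layout (`alice|1700000000`), the `.` separator, the hex encoding and the function names are all assumptions for illustration. Ikibooru's actual token format is not documented here.

```python
import hashlib
import hmac
import secrets
from typing import Optional

# Generated at every launch, so restarting the process invalidates all
# previously issued tokens.
SESSION_KEY = secrets.token_bytes(64)


def issue_token(payload: str, key: bytes = SESSION_KEY) -> str:
    """Append the HMAC-SHA256 digest of the payload to the payload."""
    digest = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return f"{payload}.{digest}"


def verify_token(token: str, key: bytes = SESSION_KEY) -> Optional[str]:
    """Return the payload if the digest matches, otherwise None."""
    payload, sep, digest_hex = token.rpartition(".")
    if not sep:
        return None
    try:
        digest = bytes.fromhex(digest_hex)
    except ValueError:
        return None
    expected = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).digest()
    # Constant-time comparison avoids leaking how many bytes matched.
    return payload if hmac.compare_digest(digest, expected) else None


token = issue_token("alice|1700000000")
print(verify_token(token))
print(verify_token(token.replace("alice", "admin")))
```

The payload travels in the clear, which matches the point above: anyone can read it, but nobody without the key can alter it undetected.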
---
`obje.html.l` is arguably the ugliest part of the entire program, in part due to the unreadable JS code within. The problem is that Lua alone is single-threaded. Should a large file be uploaded (say 50MB), it will effectively block the server from doing anything else. This is unacceptable, so JS is used to upload files in chunks of 512KB. Requests from other users can then be serviced in between chunks. So `obje` does the following upon submitting a form:
1. Send a POST request to delete appropriately marked files
2. If there are new files, send a POST request to create empty files of specific sizes (to make sure such sizes are allowed in the first place)
2.5. Send a POST request for each 512KB chunk of every file, which the server then places at the appropriate offset
3. Allow the rest of the form to be submitted naturally
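Steps 2 and 2.5 above can be sketched as follows. This is a Python illustration of the general technique (preallocate, then write each chunk at its offset), not the Lua and JS code Ikibooru actually runs. The function names and the bounds check are assumptions.

```python
import os
import tempfile

CHUNK_SIZE = 512 * 1024  # 512KB, as in the description above


def create_empty_file(path: str, size: int, max_size: int) -> None:
    """Step 2: refuse oversized uploads before any data is sent."""
    if size > max_size:
        raise ValueError("file too large")
    with open(path, "wb") as f:
        f.truncate(size)


def write_chunk(path: str, offset: int, chunk: bytes) -> None:
    """Step 2.5: place one chunk at its offset in the preallocated file."""
    if offset < 0 or offset + len(chunk) > os.path.getsize(path):
        raise ValueError("chunk falls outside the declared file size")
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(chunk)


def split_into_chunks(data: bytes, size: int = CHUNK_SIZE):
    """What the client does: yield (offset, chunk) pairs."""
    for offset in range(0, len(data), size):
        yield offset, data[offset:offset + size]


# Simulate one upload: each write_chunk call stands for a separate POST.
data = os.urandom(2 * CHUNK_SIZE + 123)
with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "upload.bin")
    create_empty_file(target, len(data), max_size=50 * 1024 * 1024)
    for offset, chunk in split_into_chunks(data):
        write_chunk(target, offset, chunk)
    with open(target, "rb") as f:
        print(f.read() == data)
```

Because every chunk carries its own offset, the server keeps no upload state between requests and is free to serve other users after each one.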
Later on I realized that large responses (file downloads) block the server as well. I don't know why it took so long, but that's what led to change no. 5 of Pegasus.lua. I still kept the uploading JS code in `obje.html.l`, because the page needed JS before that anyway, and it provided better UX.
Unfortunately, mod no. 5 cut processing speed down to around 66% of what it was without it. Since Ikibooru is already meant to be used together with a reverse proxy, and that proxy likely supports accelerated static file serving (`X-Sendfile` or `X-Accel-Redirect`), the mod was ultimately undone.
---
Times are stored in the database as `DATETIME`s in the UTC time zone. I don't remember why this was done. I suppose because the `TIMESTAMP` datatype (which uses UTC automatically) breaks after 2038, whereas `DATETIME` can go up to the year 9999. The server passes times to browsers in UTC (strings directly from the DB engine), and client-side JS code then shifts them into the user's local time zone (`static/datetimes.js`). This means that should JS be disabled, all times will be displayed as UTC.
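The shift performed in the browser amounts to tagging the naive string as UTC and converting it. A Python sketch of the same operation follows, assuming the DB engine returns the usual `YYYY-MM-DD HH:MM:SS` form. The function name is hypothetical and this is not the code in `static/datetimes.js`.

```python
from datetime import datetime, timedelta, timezone


def utc_string_to_local(value: str, tz=None) -> datetime:
    """Parse a naive UTC DATETIME string and convert it to tz.

    With tz=None the system's local time zone is used.
    """
    parsed = datetime.strptime(value, "%Y-%m-%d %H:%M:%S")
    return parsed.replace(tzinfo=timezone.utc).astimezone(tz)


# A fixed UTC+02:00 offset stands in for the user's time zone here.
plus_two = timezone(timedelta(hours=2))
print(utc_string_to_local("2024-01-15 12:00:00", plus_two).isoformat())
# 2024-01-15T14:00:00+02:00
```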
---
`main.lua` and `core.lua` are split to later separate Ikibooru the "service" from Ikibooru the "server". Some reverse proxies can create Lua environments of their own and have Ikibooru attached to them directly (skipping any inter-server communication overhead). That way, Ikibooru wouldn't have to run as an HTTP server at all, but this isn't supported yet.