#include <Parser.h>
Public Types | |
typedef SBuf::size_type | size_type |
typedef ::Parser::Tokenizer | Tokenizer |
Public Member Functions | |
Parser ()=default | |
Parser (const Parser &)=default | |
Parser & | operator= (const Parser &)=default |
Parser (Parser &&)=default | |
Parser & | operator= (Parser &&)=default |
~Parser () override | |
virtual void | clear ()=0 |
virtual bool | parse (const SBuf &aBuf)=0 |
bool | needsMoreData () const |
virtual size_type | firstLineSize () const =0 |
size in bytes of the first line including CRLF terminator | |
size_type | headerBlockSize () const |
size_type | messageHeaderSize () const |
SBuf | mimeHeader () const |
buffer containing HTTP mime headers, excluding message first-line. | |
const AnyP::ProtocolVersion & | messageProtocol () const |
the protocol label for this message | |
char * | getHostHeaderField () |
const SBuf & | remaining () const |
the remaining unprocessed section of buffer | |
Static Public Member Functions | |
static const CharacterSet & | WhitespaceCharacters () |
static const CharacterSet & | DelimiterCharacters () |
Public Attributes | |
Http::StatusCode | parseStatusCode = Http::scNone |
Protected Member Functions | |
void | skipLineTerminator (Tokenizer &) const |
bool | grabMimeBlock (const char *which, const size_t limit) |
Protected Attributes | |
SBuf | buf_ |
bytes remaining to be parsed | |
ParseState | parsingStage_ = HTTP_PARSE_NONE |
what stage the parser is currently up to | |
AnyP::ProtocolVersion | msgProtocol_ |
what protocol label has been found in the first line (if any) | |
SBuf | mimeHeaderBlock_ |
buffer holding the mime headers (if any) | |
bool | hackExpectsMime_ = false |
Whether the invalid-HTTP-as-HTTP/0.9 hack expects a mime header block. | |
Static Protected Attributes | |
static const SBuf | Http1magic |
RFC 7230 section 2.6 - 7 magic octets. | |
Private Member Functions | |
void | cleanMimePrefix () |
void | unfoldMime () |
Detailed Description
HTTP/1.x protocol parser
Works on a raw character I/O buffer and tokenizes the content into the major CRLF-delimited segments of an HTTP/1 protocol message:
- first-line (request-line / simple-request / status-line)
- mime-header 0*( header-name ':' SP field-value CRLF)
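The two segments listed above can be illustrated with a minimal sketch. It uses `std::string` rather than SBuf and accepts only strict CRLF terminators; `SplitMessage` and `splitHttp1Message()` are hypothetical names for this illustration and are not part of the documented class.

```cpp
#include <cstddef>
#include <string>

// Hypothetical result type for this sketch; not part of Squid.
struct SplitMessage {
    std::string firstLine;   // request-line or status-line, including its CRLF
    std::string mimeHeader;  // header fields, including the terminating empty line
    bool complete = false;   // false until the end of the header block is seen
};

// Split the start of an HTTP/1 message at strict CRLF boundaries.
inline SplitMessage splitHttp1Message(const std::string &raw)
{
    SplitMessage out;
    const std::size_t lineEnd = raw.find("\r\n");
    if (lineEnd == std::string::npos)
        return out; // the first line is not complete yet
    out.firstLine = raw.substr(0, lineEnd + 2);

    // Empty header block: the empty line directly follows the first line.
    if (raw.compare(lineEnd + 2, 2, "\r\n") == 0) {
        out.mimeHeader = "\r\n";
        out.complete = true;
        return out;
    }

    // Otherwise the header block ends at the first empty line (CRLF CRLF).
    const std::size_t blockEnd = raw.find("\r\n\r\n", lineEnd + 2);
    if (blockEnd == std::string::npos)
        return out; // header block still incomplete
    out.mimeHeader = raw.substr(lineEnd + 2, blockEnd + 4 - (lineEnd + 2));
    out.complete = true;
    return out;
}
```

Anything after the empty line (the message body) is left out of both segments.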
Member Typedef Documentation
◆ size_type
◆ Tokenizer
Constructor & Destructor Documentation
◆ Parser() [1/3]
|
default |
◆ Parser() [2/3]
|
default |
◆ Parser() [3/3]
|
default |
◆ ~Parser()
Member Function Documentation
◆ cleanMimePrefix()
|
private |
Remove invalid lines (if any) from the mime prefix
RFC 7230 section 3: "A recipient that receives whitespace between the start-line and the first header field MUST ... consume each whitespace-preceded line without further processing of it."
We need to always use the relaxed delimiters here to prevent line smuggling through strict parsers.
Note that 'whitespace' in RFC 7230 includes CR. So that means sequences of CRLF will be pruned, but not sequences of bare-LF.
Definition at line 97 of file Parser.cc.
References Http::One::CrLf(), CharacterSet::LF, LineCharacters(), and RelaxedDelimiterCharacters().
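A minimal sketch of the pruning rule described for cleanMimePrefix(), written against `std::string` instead of SBuf and Tokenizer. The relaxed delimiter set used here (SP, HTAB, VT, FF, CR) is an assumption for the illustration, and both function names are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Assumed "relaxed" delimiter set for this sketch: SP, HTAB, VT, FF and CR.
inline bool isRelaxedDelimiter(char c)
{
    return c == ' ' || c == '\t' || c == '\v' || c == '\f' || c == '\r';
}

// Drop every leading line that starts with a relaxed delimiter and return
// what is left of the header block. If nothing is left, return a bare CRLF
// so that the block still carries its terminator.
inline std::string dropWhitespacePrecededLines(const std::string &block)
{
    std::size_t pos = 0;
    while (pos < block.size() && isRelaxedDelimiter(block[pos])) {
        const std::size_t lf = block.find('\n', pos);
        if (lf == std::string::npos) {
            pos = block.size(); // no LF left: the rest is one unterminated line
            break;
        }
        pos = lf + 1; // continue after the LF that ends the skipped line
    }
    return pos >= block.size() ? std::string("\r\n") : block.substr(pos);
}
```

Because CR is in the delimiter set and LF is not, a run of CRLF lines is pruned while a run of bare LF lines is left untouched, matching the note above.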
◆ clear()
|
pure virtual |
Set this parser back to a default state. Will DROP any reference to a buffer (does not free).
Implemented in Http::One::RequestParser, Http::One::ResponseParser, and Http::One::TeChunkedParser.
Definition at line 27 of file Parser.cc.
References buf_, SBuf::clear(), Http::One::HTTP_PARSE_NONE, mimeHeaderBlock_, msgProtocol_, parsingStage_, and Ftp::ProtocolVersion().
◆ DelimiterCharacters()
|
static |
Whitespace between protocol elements in restricted contexts like request line, status line, asctime-date, and credentials. Seen in RFCs as SP but may be "relaxed" by us. See also: WhitespaceCharacters(). XXX: Misnamed and overused.
Definition at line 59 of file Parser.cc.
References Config, SquidConfig::onoff, SquidConfig::relaxed_header_parser, RelaxedDelimiterCharacters(), and CharacterSet::SP.
Referenced by Http::ContentLengthInterpreter::goodSuffix(), and Http::One::ResponseParser::ParseResponseStatus().
◆ firstLineSize()
|
pure virtual |
Implemented in Http::One::RequestParser, Http::One::ResponseParser, and Http::One::TeChunkedParser.
Referenced by messageHeaderSize().
◆ getHostHeaderField()
char * Http::One::Parser::getHostHeaderField | ( | ) |
Scan the mime header block (badly) for a Host header.
BUG: omits lines when searching for headers with obs-fold or multiple entries.
BUG: limits output to just 1KB when Squid accepts up to 64KB line length.
- Returns
- A pointer to a field-value of the first matching field-name, or NULL.
Definition at line 213 of file Parser.cc.
References CharacterSet::ALPHA, SBuf::caseCmp(), SBuf::chop(), SBuf::consume(), Http::One::CrLf(), debugs, CharacterSet::DIGIT, SBuf::findFirstNotOf(), GET_HDR_SZ, CharacterSet::LF, LineCharacters(), LOCAL_ARRAY, SBuf::npos, SBufToCstring(), SBuf::substr(), SBuf::trim(), and CharacterSet::WSP.
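For illustration only, a line-by-line scan for the first Host field might look like the sketch below. It is not the Squid implementation: it returns a `std::string` copy instead of a `char *`, applies no 1KB cap, and makes no attempt to handle obs-fold. `HostLookup`, `iequals()` and `findHostField()` are hypothetical names.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type: `found` distinguishes "no Host field" from an
// empty field-value.
struct HostLookup {
    bool found;
    std::string value;
};

// ASCII case-insensitive equality, as field-names are case-insensitive.
inline bool iequals(const std::string &a, const std::string &b)
{
    if (a.size() != b.size())
        return false;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (std::tolower(static_cast<unsigned char>(a[i])) !=
            std::tolower(static_cast<unsigned char>(b[i])))
            return false;
    }
    return true;
}

// Return the trimmed field-value of the first "Host" field in the block.
inline HostLookup findHostField(const std::string &block)
{
    std::size_t pos = 0;
    while (pos < block.size()) {
        std::size_t eol = block.find('\n', pos);
        if (eol == std::string::npos)
            eol = block.size();
        const std::string line = block.substr(pos, eol - pos);
        pos = eol + 1;

        const std::size_t colon = line.find(':');
        if (colon == std::string::npos || !iequals(line.substr(0, colon), "host"))
            continue;

        // Trim SP, HTAB and the trailing CR around the field-value.
        const std::string value = line.substr(colon + 1);
        const std::size_t first = value.find_first_not_of(" \t\r");
        if (first == std::string::npos)
            return {true, ""};
        const std::size_t last = value.find_last_not_of(" \t\r");
        return {true, value.substr(first, last - first + 1)};
    }
    return {false, ""};
}
```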
◆ grabMimeBlock()
|
protected |
Scan to find the mime headers block for current message.
- Return values
- true: the mime block (or a block's non-existence) has been identified accurately within limit characters. mimeHeaderBlock_ has been updated and buf_ consumed.
- false: an error occurred, or no mime terminator was found within limit.
Definition at line 157 of file Parser.cc.
References debugs, headersEnd(), Http::One::HTTP_PARSE_DONE, AnyP::PROTO_HTTP, AnyP::PROTO_ICY, and Http::scHeaderTooLarge.
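The return values above can be sketched as a three-way result, which makes the two "false" cases explicit. This is a simplified illustration on `std::string` with strict CRLF only. `GrabStatus`, `GrabResult` and `grabBlock()` are hypothetical, and treating a size equal to `limit` as too large is an assumption of the sketch.

```cpp
#include <cstddef>
#include <string>

enum class GrabStatus { Found, NeedMore, TooLarge };

// Hypothetical result: the identified header block and the unconsumed rest.
struct GrabResult {
    GrabStatus status;
    std::string block;
    std::string rest;
};

// Look for the end of the mime header block within `limit` bytes of `buf`.
// `buf` is expected to start right after the first line.
inline GrabResult grabBlock(const std::string &buf, std::size_t limit)
{
    std::size_t end = std::string::npos;
    if (buf.compare(0, 2, "\r\n") == 0) {
        end = 2; // no header fields at all: the block is just the empty line
    } else {
        const std::size_t at = buf.find("\r\n\r\n");
        if (at != std::string::npos)
            end = at + 4;
    }

    if (end == std::string::npos) {
        // No terminator yet: either wait for more bytes or give up.
        const GrabStatus status =
            buf.size() >= limit ? GrabStatus::TooLarge : GrabStatus::NeedMore;
        return {status, "", buf};
    }
    if (end >= limit)
        return {GrabStatus::TooLarge, "", buf};

    // Success: hand out the block and consume it from the buffer.
    return {GrabStatus::Found, buf.substr(0, end), buf.substr(end)};
}
```

In this sketch `TooLarge` corresponds to the case where the documented function would report Http::scHeaderTooLarge.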
◆ headerBlockSize()
|
inline |
size in bytes of the message headers including CRLF terminator(s) but excluding first-line bytes
Definition at line 73 of file Parser.h.
References SBuf::length(), and mimeHeaderBlock_.
Referenced by messageHeaderSize(), and Http::Message::parseHeader().
◆ messageHeaderSize()
|
inline |
size in bytes of the HTTP message block. Includes the first-line and mime headers; excludes any body/entity/payload bytes and any garbage prefix before the first-line.
Definition at line 78 of file Parser.h.
References firstLineSize(), and headerBlockSize().
Referenced by Http::Message::parseHeader().
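As a worked example of how the three size accessors relate, the sketch below stores the two segments in a hypothetical `ParsedSizes` struct (not part of Squid) and sums them.

```cpp
#include <cstddef>
#include <string>

// Hypothetical holder for the two parsed segments of one message.
struct ParsedSizes {
    std::string firstLine;       // includes its CRLF
    std::string mimeHeaderBlock; // includes the terminating CRLF(s)

    std::size_t firstLineSize() const { return firstLine.size(); }
    std::size_t headerBlockSize() const { return mimeHeaderBlock.size(); }

    // First-line plus mime headers. Body bytes and any garbage prefix are
    // not stored in this struct, so they are never counted.
    std::size_t messageHeaderSize() const
    {
        return firstLineSize() + headerBlockSize();
    }
};
```

For the first line `GET / HTTP/1.1` plus CRLF (16 bytes) and the header block `Host: a` plus CRLF CRLF (11 bytes), the message header size is 27 bytes.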
◆ messageProtocol()
|
inline |
Definition at line 84 of file Parser.h.
References msgProtocol_.
◆ mimeHeader()
|
inline |
Definition at line 81 of file Parser.h.
References mimeHeaderBlock_.
Referenced by Http::Message::parseHeader().
◆ needsMoreData()
|
inline |
Whether the parser is waiting on more data to complete parsing a message. Use to distinguish between incomplete data and error results when parse() returns false.
Definition at line 66 of file Parser.h.
References Http::One::HTTP_PARSE_DONE, and parsingStage_.
Referenced by ConnStateData::handleChunkedRequestBody(), TestHttp1Parser::testDripFeed(), TestHttp1Parser::testParserConstruct(), and testResults().
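A caller might combine parse() and needsMoreData() as in the sketch below. `ToyParser` is a hypothetical stand-in that accepts a single `PING` line; it only mimics the documented contract and does not use the real class.

```cpp
#include <string>

// Toy parser, hypothetical: accepts exactly one line, "PING", ended by CRLF.
class ToyParser
{
public:
    // Returns true only when a full, valid message was found.
    bool parse(const std::string &buf)
    {
        const std::string::size_type eol = buf.find("\r\n");
        if (eol == std::string::npos)
            return false; // incomplete: parsing has not finished
        done_ = true;     // a verdict was reached, good or bad
        return buf.compare(0, eol, "PING") == 0;
    }

    // True while parsing has not finished, mirroring the documented meaning.
    bool needsMoreData() const { return !done_; }

private:
    bool done_ = false;
};

enum class Outcome { Parsed, Incomplete, Invalid };

// Classify one parse attempt: a false result from parse() is an error only
// when the parser is no longer waiting for data.
inline Outcome classify(const std::string &buf)
{
    ToyParser parser;
    if (parser.parse(buf))
        return Outcome::Parsed;
    return parser.needsMoreData() ? Outcome::Incomplete : Outcome::Invalid;
}
```

On `Incomplete` the caller would read more bytes and call parse() again with the larger buffer; on `Invalid` it would stop and handle the error.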
◆ operator=() [1/2]
◆ operator=() [2/2]
◆ parse()
|
pure virtual |
attempt to parse a message from the buffer
- Return values
- true: a full message was found and parsed
- false: incomplete, invalid or no message was found
Implemented in Http::One::TeChunkedParser, Http::One::RequestParser, and Http::One::ResponseParser.
◆ remaining()
|
inline |
Definition at line 98 of file Parser.h.
References buf_.
Referenced by HttpStateData::decodeAndWriteReplyBody(), ConnStateData::handleChunkedRequestBody(), and TestHttp1Parser::testDripFeed().
◆ skipLineTerminator()
|
protected |
Detect and skip the CRLF or (if tolerant) LF line terminator, consuming it from the tokenizer.
- Exceptions
-
exception on bad input, or InsufficientInput when more data is needed
Definition at line 66 of file Parser.cc.
References Config, Http::One::CrLf(), CharacterSet::LF, SquidConfig::onoff, and SquidConfig::relaxed_header_parser.
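The terminator rule can be sketched on a plain `std::string` with an explicit position instead of a Tokenizer. `NeedMoreInput` and `skipTerminator()` are hypothetical names, and the `tolerant` flag stands in for the relaxed_header_parser setting.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical exception for "not enough bytes yet" in this sketch.
struct NeedMoreInput : std::runtime_error {
    NeedMoreInput() : std::runtime_error("insufficient input") {}
};

// Return the position just past the line terminator that starts at `pos`.
// Throws NeedMoreInput when the terminator may still be arriving and
// std::invalid_argument when something else is found there.
inline std::size_t skipTerminator(const std::string &buf, std::size_t pos, bool tolerant)
{
    assert(pos <= buf.size());
    const std::string rest = buf.substr(pos);
    if (rest.compare(0, 2, "\r\n") == 0)
        return pos + 2; // strict CRLF
    if (tolerant && !rest.empty() && rest[0] == '\n')
        return pos + 1; // bare LF, accepted only when tolerant
    if (rest.empty() || rest == "\r")
        throw NeedMoreInput(); // nothing yet, or only the CR half of CRLF
    throw std::invalid_argument("garbage instead of a line terminator");
}
```

Keeping the two failure modes as distinct exception types lets the caller wait for more bytes in one case and reject the message in the other.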
◆ unfoldMime()
|
private |
Replace obs-fold with a single SP.
RFC 7230 section 3.2.4 "A server that receives an obs-fold in a request message that is not within a message/http container MUST ... replace each received obs-fold with one or more SP octets prior to interpreting the field value or forwarding the message downstream."
"A proxy or gateway that receives an obs-fold in a response message that is not within a message/http container MUST ... replace each received obs-fold with one or more SP octets prior to interpreting the field value or forwarding the message downstream."
Definition at line 132 of file Parser.cc.
References CharacterSet::CR, CharacterSet::LF, CharacterSet::rename(), SBuf::substr(), and CharacterSet::WSP.
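A minimal sketch of the replacement, assuming the strict RFC 7230 form of obs-fold (CRLF followed by one or more SP or HTAB). It works on `std::string`, and `unfoldObsFold()` is a hypothetical name; any tolerance for bare LF is not modelled here.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

inline bool isWsp(char c) { return c == ' ' || c == '\t'; }

// Replace each obs-fold (CRLF followed by one or more SP/HTAB) with one SP.
inline std::string unfoldObsFold(const std::string &block)
{
    std::string out;
    out.reserve(block.size()); // unfolding never makes the block longer
    std::size_t i = 0;
    while (i < block.size()) {
        const bool crlf = block.compare(i, 2, "\r\n") == 0;
        if (crlf && i + 2 < block.size() && isWsp(block[i + 2])) {
            i += 2;
            while (i < block.size() && isWsp(block[i]))
                ++i;    // swallow the whole run of folding whitespace
            out += ' '; // one obs-fold becomes one SP
        } else {
            out += block[i++];
        }
    }
    assert(out.size() <= block.size());
    return out;
}
```

A CRLF that ends a field, or the final empty line, is not followed by SP or HTAB and is therefore copied unchanged.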
◆ WhitespaceCharacters()
|
static |
Whitespace between regular protocol elements. Seen in RFCs as OWS, RWS, BWS, SP/HTAB but may be "relaxed" by us. See also: DelimiterCharacters().
Definition at line 52 of file Parser.cc.
References Config, SquidConfig::onoff, SquidConfig::relaxed_header_parser, RelaxedDelimiterCharacters(), and CharacterSet::WSP.
Referenced by Http::ContentLengthInterpreter::findDigits(), and Http::One::ParseBws().
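The strict-versus-relaxed choice made by WhitespaceCharacters() and DelimiterCharacters() can be sketched as below. The `relaxed` parameter stands in for the relaxed_header_parser setting. The contents of the relaxed set (SP, HTAB, VT, FF, bare CR, following RFC 7230 section 3.5) are an assumption of this sketch, and all function names are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Assumed relaxed set for this sketch: SP, HTAB, VT, FF and bare CR.
inline const std::string &relaxedDelimiters()
{
    static const std::string chars(" \t\v\f\r");
    return chars;
}

// Restricted contexts: strict mode allows only SP.
inline const std::string &delimiterCharacters(bool relaxed)
{
    static const std::string strict(" ");
    return relaxed ? relaxedDelimiters() : strict;
}

// Regular contexts: strict mode allows SP and HTAB.
inline const std::string &whitespaceCharacters(bool relaxed)
{
    static const std::string strict(" \t");
    return relaxed ? relaxedDelimiters() : strict;
}

// Count the delimiter characters at the start of `s`.
inline std::size_t skipDelimiters(const std::string &s, bool relaxed)
{
    const std::size_t n = s.find_first_not_of(delimiterCharacters(relaxed));
    return n == std::string::npos ? s.size() : n;
}
```

With the input SP HTAB `200 OK`, the strict delimiter set stops after the single SP, while the relaxed set also skips the HTAB.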
Member Data Documentation
◆ buf_
|
protected |
Definition at line 146 of file Parser.h.
Referenced by clear(), remaining(), TestHttp1Parser::testParserConstruct(), and testResults().
◆ hackExpectsMime_
◆ Http1magic
|
staticprotected |
Definition at line 143 of file Parser.h.
Referenced by Http::One::ResponseParser::firstLineSize().
◆ mimeHeaderBlock_
|
protected |
Definition at line 155 of file Parser.h.
Referenced by clear(), headerBlockSize(), and mimeHeader().
◆ msgProtocol_
|
protected |
Definition at line 152 of file Parser.h.
Referenced by Http::One::TeChunkedParser::TeChunkedParser(), clear(), Http::One::ResponseParser::firstLineSize(), Http::One::RequestParser::http0(), messageProtocol(), TestHttp1Parser::testParserConstruct(), and testResults().
◆ parseStatusCode
Http::StatusCode Http::One::Parser::parseStatusCode = Http::scNone |
HTTP status code resulting from the parse process, to be used when handling invalid messages.
Http::scNone indicates incomplete parse, Http::scOkay indicates no error, other codes represent a parse error.
Definition at line 108 of file Parser.h.
Referenced by TestHttp1Parser::testParserConstruct(), and testResults().
◆ parsingStage_
|
protected |
Definition at line 149 of file Parser.h.
Referenced by clear(), needsMoreData(), TestHttp1Parser::testParserConstruct(), and testResults().
The documentation for this class was generated from the following files: