Displaying Website Information in SwiftUI via XPath

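First, the third-party library Kanna is needed to parse the HTML.

Once parsing is done, use XPath to locate the content you want and display it. This example shows the site's title and favicon.

The code is as follows: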

import SwiftUI
import Kanna

struct ContentView: View {

    @State private var imageUrl: String = ""
    @State private var webTitle: String = ""

    var body: some View {
        VStack {

            Text(webTitle)

            AsyncImage(url: URL(string: imageUrl)) { image in
                image
                    .resizable()
                    .scaledToFit()
                    .frame(width: 40, height: 40)
                    .clipped()
                    .cornerRadius(4)
            } placeholder: {
                ProgressView()
            }

        }
        .onAppear {
            // Unwrap the URL safely instead of force-unwrapping it.
            guard let url = URL(string: "https://wonderhoi.com"),
                  UIApplication.shared.canOpenURL(url) else { return }

            let task = URLSession.shared.dataTask(with: url) { data, _, error in
                guard error == nil else {
                    print(error!)
                    return
                }
                guard let data = data else {
                    print("data is nil")
                    return
                }
                guard let html = String(data: data, encoding: .utf8) else {
                    print("the response is not in UTF-8")
                    return
                }
                guard let doc = try? HTML(html: html, encoding: .utf8) else { return }

                // The completion handler runs off the main thread,
                // so @State is updated on the main queue.
                let title = doc.title ?? "None"
                DispatchQueue.main.async { webTitle = title }

                for appleTouchIcon in doc.xpath("//meta[@rel = 'apple-touch-icon']/@content | //link[@rel = 'apple-touch-icon']/@href | //meta[@rel = 'apple-touch-icon']/@href | //link[@rel = 'apple-touch-icon']/@content") {
                    // The href may be relative, so resolve it against the page URL.
                    guard let iconString = appleTouchIcon.text,
                          let iconUrl = URL(string: iconString, relativeTo: url)?.absoluteURL,
                          UIApplication.shared.canOpenURL(iconUrl) else { continue }

                    let task = URLSession.shared.dataTask(with: iconUrl) { data, _, error in
                        guard error == nil else {
                            print(error!)
                            return
                        }
                        guard let data = data else {
                            print("data is nil")
                            return
                        }
                        // Only accept the URL if the payload decodes as an image.
                        guard UIImage(data: data) != nil else {
                            print("no picture")
                            return
                        }

                        DispatchQueue.main.async { imageUrl = iconUrl.absoluteString }
                    }
                    task.resume()
                }
            }
            task.resume()
        }
    }
}
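
One detail worth checking when fetching the icon: the value of an apple-touch-icon href is often a relative path such as /apple-touch-icon.png, which cannot be downloaded until it is resolved against the page URL. The following is a minimal sketch of that step using only Foundation. The helper name resolveIconURL and the sample URLs are assumptions made for illustration, not part of the original code.

```swift
import Foundation

/// Hypothetical helper: resolves an icon href (absolute or relative)
/// against the URL of the page it was found on.
func resolveIconURL(_ href: String, pageURL: URL) -> URL? {
    let trimmed = href.trimmingCharacters(in: .whitespacesAndNewlines)
    guard !trimmed.isEmpty else { return nil }
    // URL(string:relativeTo:) leaves absolute URLs unchanged and
    // resolves relative ones against the base.
    return URL(string: trimmed, relativeTo: pageURL)?.absoluteURL
}

// Sample page URL, assumed for illustration only.
let page = URL(string: "https://wonderhoi.com/2024/11/13/post/")!

let relativeIcon = resolveIconURL("/apple-touch-icon.png", pageURL: page)
let absoluteIcon = resolveIconURL("https://cdn.example.com/icon.png", pageURL: page)

print(relativeIcon?.absoluteString ?? "nil")
print(absoluteIcon?.absoluteString ?? "nil")
```

The resolved URL's absoluteString can then be handed to AsyncImage in place of the raw attribute text.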

The XPath expression used in the code is:

//meta[@rel = 'apple-touch-icon']/@content | //link[@rel = 'apple-touch-icon']/@href | //meta[@rel = 'apple-touch-icon']/@href | //link[@rel = 'apple-touch-icon']/@content

Taking //meta[@rel = 'apple-touch-icon']/@content as an example:

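  • **//**: selects matching nodes anywhere in the document, regardless of depth
  • meta: the tag name of the node to select
  • **@**: marks an attribute rather than a child element
  • rel: the name of the attribute tested inside the [...] predicate
  • **/@**: extracts the value of the named attribute from the nodes matched so far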
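
To see these pieces working together, here is a small sketch that runs one branch of the expression against an inline HTML string with Kanna. The sample HTML and the helper name attributeValues are assumptions made for illustration. It uses only the Kanna calls already seen in the code above (HTML(html:encoding:), xpath, and text).

```swift
import Kanna

// Hypothetical inline HTML, used only for illustration.
let sampleHTML = """
<html>
  <head>
    <title>Example</title>
    <link rel="apple-touch-icon" href="/apple-touch-icon.png">
    <link rel="stylesheet" href="/style.css">
  </head>
  <body></body>
</html>
"""

/// Hypothetical helper: returns the text of every attribute node matched by `xpath`.
func attributeValues(in html: String, xpath: String) -> [String] {
    guard let doc = try? HTML(html: html, encoding: .utf8) else { return [] }
    return doc.xpath(xpath).compactMap { $0.text }
}

// `//link` matches both <link> elements, the predicate keeps only the one
// whose rel is apple-touch-icon, and `/@href` extracts that attribute's value.
let hrefs = attributeValues(in: sampleHTML,
                            xpath: "//link[@rel = 'apple-touch-icon']/@href")
print(hrefs)
```

The stylesheet link is filtered out by the predicate, which is why the [@rel = '...'] part matters even though both elements share the same tag name.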

In addition, the site's Open Graph image (og:image) can be obtained with //meta[@property = 'og:image']/@content.
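
As a rough sketch of how that could look with Kanna, the snippet below reads og:image from an inline HTML string. The sample HTML and the function name ogImageURLString are assumptions made for illustration. at_xpath returns the first matching node, or nil when the page has no such tag.

```swift
import Kanna

// Hypothetical inline HTML, used only for illustration.
let pageHTML = """
<html>
  <head>
    <title>Example</title>
    <meta property="og:title" content="Example">
    <meta property="og:image" content="https://example.com/cover.png">
  </head>
  <body></body>
</html>
"""

/// Hypothetical helper: returns the og:image URL string, or nil if the tag is absent.
func ogImageURLString(in html: String) -> String? {
    guard let doc = try? HTML(html: html, encoding: .utf8) else { return nil }
    // at_xpath yields the first matching node, or nil when nothing matches.
    return doc.at_xpath("//meta[@property = 'og:image']/@content")?.text
}

let ogImage = ogImageURLString(in: pageHTML)
print(ogImage ?? "no og:image found")
```

Because many sites omit og:image, treating the result as optional and falling back to the apple-touch-icon lookup is one reasonable design.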


Author: wonderhoi
Published: November 13, 2024
Source: https://wonderhoi.com/2024/11/13/SwiftUI-通过-XPath-显示网站信息/